Hacker News | wholemoley's comments

As I mentioned in the other post, the curiosity bots can help with this.

They're rewarded for exploring, not for in-game values.

Maybe they're not enough on their own, but in conjunction with other things, I bet we could beat Zelda. I had the bot exploring enough to find the first dungeon of LoZ.
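
The core mechanism is small. Here's a toy PyTorch sketch of a prediction-error ("surprise") bonus, just the shape of the idea rather than the actual noreward-rl code: the agent gets rewarded whenever its forward model mispredicts the next state, and no in-game values are involved.

    import torch
    import torch.nn as nn

    class ForwardModel(nn.Module):
        # Toy forward-dynamics model: predicts next-state features
        # from the current state features and the action taken.
        def __init__(self, state_dim=32, action_dim=8):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                nn.Linear(64, state_dim))

        def forward(self, state, action):
            return self.net(torch.cat([state, action], dim=-1))

    def intrinsic_reward(model, state, action, next_state):
        # The bonus is the prediction error ("surprise"): the agent is paid
        # for reaching states its model can't predict yet, not for any
        # in-game score.
        pred = model(state, action)
        return ((pred - next_state) ** 2).mean(dim=-1)

    # Dummy usage with random tensors standing in for encoded frames/actions.
    model = ForwardModel()
    s, a, s2 = torch.randn(1, 32), torch.randn(1, 8), torch.randn(1, 32)
    print(intrinsic_reward(model, s, a, s2).item())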


I just took a look at the curiosity video. It's funny: I think with enough refinement something like this could beat Zelda. Except it wouldn't actually know it beat the game! I feel like that is cheating, somehow.

Like maybe you could get American Fuzzy Lop [1] to beat Zelda. Isn't that the same thing, in principle?

[1] http://lcamtuf.coredump.cx/afl/


AFL might eventually get lucky, but I'm guessing you'd want a hybrid approach that combines symbolic execution with a fuzzer like AFL, e.g. QSYM [1].

[1] https://www.usenix.org/system/files/conference/usenixsecurit...


This person's work (https://github.com/pathak22/noreward-rl) and his curiosity AI (https://github.com/openai/large-scale-curiosity) can do some cool stuff.

I used it to train an AI to play Mario Kart (https://www.youtube.com/watch?v=A8oSnh0M864).

I also had it playing The Legend of Zelda (it got as far as finding the first dungeon, but with more power I'm certain it could explore the whole map).


I made a seq2seq chatbot with Matrix/Riot, using the Matrix Python SDK (https://github.com/matrix-org/matrix-python-sdk) and https://github.com/tensorlayer/seq2seq-chatbot.

Here it is: https://www.youtube.com/watch?v=rCggOcKZn-c

(There's a fun interaction at 25:52)
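
The glue code on the Matrix side is tiny. Here's a stripped-down sketch; the homeserver, account, and room are placeholders, and seq2seq_reply stands in for the trained tensorlayer model's inference call:

    import time
    from matrix_client.client import MatrixClient

    BOT_USER = "@mybot:matrix.org"  # placeholder bot account

    def seq2seq_reply(prompt):
        # Stand-in for the trained seq2seq model's inference call.
        return "echo: " + prompt

    def on_message(room, event):
        # Reply to anyone else's m.room.message events.
        if event["type"] == "m.room.message" and event["sender"] != BOT_USER:
            room.send_text(seq2seq_reply(event["content"].get("body", "")))

    client = MatrixClient("https://matrix.org")       # placeholder homeserver
    client.login_with_password(username="mybot", password="...")
    room = client.join_room("#some-room:matrix.org")  # placeholder room
    room.add_listener(on_message)
    client.start_listener_thread()
    while True:  # keep the process alive while the listener thread runs
        time.sleep(1)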


That triple-nested if is horrible, though…


Marks' Standard Handbook for Mechanical Engineers, and AREMA.


The article is from last year, but it's still extremely valuable and interesting.

Exploring this topic is currently my primary hobby. Specifically, I've been using OpenAI's Retro (Sonic, Contra, Mario, Donkey Kong, and more recently F-Zero) and comparing the ancient NEAT with more fashionable stuff like DQN, PPO, A3C and DDPG.

In my extremely limited experience, NEAT seems to outperform all of these other algorithms. I believe the advantage is its potential for strange/novel network structures.

And the best part is that NEAT doesn't require a powerful GPU.

Apologies for the shameless plug, but here's a YouTube series I made about using Retro and NEAT together to play Sonic: https://www.youtube.com/watch?v=pClGmU1JEsM&list=PLTWFMbPFsv...
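
If you want to try it yourself, the plumbing looks roughly like this. It's a sketch rather than the exact code from the series; the game/state names, the downsampling, and the fitness are placeholders, and num_inputs/num_outputs in the NEAT config file have to match what you feed in:

    import neat
    import retro

    def eval_genomes(genomes, config):
        env = retro.make(game='SonicTheHedgehog-Genesis',
                         state='GreenHillZone.Act1')
        for genome_id, genome in genomes:
            net = neat.nn.FeedForwardNetwork.create(genome, config)
            obs = env.reset()
            fitness, done = 0.0, False
            while not done:
                # Crude grayscale + stride-8 downsample; num_inputs in the
                # NEAT config file must match the flattened size.
                inputs = (obs[::8, ::8].mean(axis=2) / 255.0).flatten()
                # 12 outputs, one per Genesis controller button.
                action = [1 if o > 0.5 else 0 for o in net.activate(inputs)]
                obs, rew, done, info = env.step(action)
                fitness += rew
            genome.fitness = fitness
        env.close()

    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         'config-feedforward')  # your NEAT config file
    pop = neat.Population(config)
    pop.add_reporter(neat.StdOutReporter(True))
    winner = pop.run(eval_genomes, 10)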


You are evolving the topology, but using regular gradient descent/backprop for any given network, correct?


No, in NEAT both the weights and topology are evolved. It is totally gradient-free.


Yeah, topology and weights. It's highly sensitive to initial conditions. You almost need another NEAT network to evolve the initial conditions. I believe it's turtles all the way down.



Whoa!


If you do decide to go with an evolution-based policy, I highly recommend messing around with NEAT (https://github.com/CodeReclaimers/neat-python). I have successfully used it to play a number of SNES/Genesis games.

I even made a tutorial series: https://www.youtube.com/watch?v=pClGmU1JEsM&list=PLTWFMbPFsv...

Apologies for the shameless plug.


I've been using the neat-python library in OpenAI's Retro with some success. While it works quickly, it usually gets stuck in local maxima. It seems to struggle with long sequences, and defining the fitness function/parameters is an art form.

Here's a video of Donkey Kong Country played by neat-python in OpenAI's Retro. It took 8 generations of 20 genomes to beat level one. I'll post the code if anyone's interested.

https://vimeo.com/280611464
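
In the meantime, here's a rough sketch of the kind of fitness shaping involved (not the exact code from the video): reward the furthest horizontal position reached and cut a genome off early once it stalls. The 'x' info key is a placeholder and depends on the game's data.json integration.

    def episode_fitness(env, net, stall_limit=300):
        # Shaped fitness for a side-scroller in gym-retro: score a genome by
        # the furthest horizontal position it reaches, and stop early once it
        # has gone stall_limit frames without making progress.
        # NOTE: 'x' is a placeholder info key; the real name depends on the
        # variables defined in the game's data.json integration.
        obs = env.reset()
        best_x, frames_since_gain, done = 0, 0, False
        while not done:
            inputs = (obs[::8, ::8].mean(axis=2) / 255.0).flatten()
            action = [1 if o > 0.5 else 0 for o in net.activate(inputs)]
            obs, rew, done, info = env.step(action)
            x = info.get('x', 0)
            if x > best_x:
                best_x, frames_since_gain = x, 0
            else:
                frames_since_gain += 1
            if frames_since_gain > stall_limit:
                break
        return float(best_x)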


I'd be interested in the code.



Ok, I'll clean it up and post it.


Please do!


Just letting you know he posted his code under that Vimeo video. https://github.com/CodeReclaimers/neat-python and https://github.com/openai/retro/


Alrighty, I'll clean it up and post it.


I'd be happy to see the uncleaned code as well; you don't have to make an effort on my account!


