discountsocialism

joined 4 years ago
[–] discountsocialism@hexbear.net 4 points 7 hours ago (1 children)

That's correct. A decision tree is essentially a classification task over some input state. So we invert the model: we only need to determine which actions are valid, give a negative reward to invalid actions, and let the algorithm figure out the best sequence to take.
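A minimal sketch of that idea, assuming a toy roguelike environment (the names `get_valid_actions`, `step`, and the penalty value are illustrative, not the actual implementation):

```python
# Sketch: penalize invalid actions so the learner avoids them on its own,
# instead of hand-coding a decision tree of allowed transitions.

INVALID_PENALTY = -10.0  # large negative reward for picking an invalid action

def get_valid_actions(state):
    """Toy validity check: 'unlock' is only valid if we hold the key."""
    valid = {"move"}
    if "key" in state["inventory"]:
        valid.add("unlock")
    return valid

def step(state, action):
    """Return (next_state, reward); invalid actions leave the state unchanged."""
    if action not in get_valid_actions(state):
        return state, INVALID_PENALTY
    next_state = dict(state, inventory=set(state["inventory"]))
    if action == "unlock":
        next_state["door_locked"] = False
        return next_state, 1.0  # small positive reward for progress
    return next_state, 0.0
```

Any RL algorithm trained against `step` will learn to avoid the penalized actions, which is what makes the inversion (validity check instead of full decision tree) workable.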

Getting a story-relevant action sequence is something else. Let's say you have a goal of "unlock the door". This is semantically similar to "Get key. Use key on locked door" (when using text embeddings and cosine similarity). I'm using proximal policy optimization / MCTS to find the sequence of events that best fits the goal narrative by simulating the environment. For more complex actions like "leave through the exit", I'm using an LLM to generate intermediate goals in plain English. There are a lot of limitations and it's very experimental, but it works well enough to allow arbitrary goals.
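The embedding-similarity scoring can be sketched like this. A real system would use a sentence-embedding model; here a bag-of-words count vector stands in for it so the example is self-contained, and the candidate strings are made up:

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a real text embedding: a bag-of-words count vector.
    In practice this would be a sentence-embedding model's output."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Score candidate action narrations against the goal; the search (PPO/MCTS)
# would use scores like this as a fitness signal for simulated rollouts.
goal = embed("unlock the door")
candidates = [
    "get the key use the key on the locked door",
    "wander around the dark forest",
]
best = max(candidates, key=lambda c: cosine(goal, embed(c)))
```

With a proper embedding model, "Get key. Use key on locked door" scores close to "unlock the door" even without shared surface words, which is the property the search relies on.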

It is less computationally complex than an exhaustive search, and it is also 'online', so we can use it and continue to train it at the same time; the actions just get more optimal over time.
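The 'online' property is the standard one from incremental RL: value estimates can be updated after every interaction, so acting and training interleave. A minimal tabular sketch (not the actual PPO setup, just the idea):

```python
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount, illustrative values
Q = defaultdict(float)   # Q[(state, action)] value estimates

def update(state, action, reward, next_state, next_actions):
    """One online Q-learning step: nudge the estimate toward the observed
    reward plus the discounted best estimate from the next state."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

Each call to `update` improves the estimate immediately, so the bot can act with the current values while they keep converging.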

[–] discountsocialism@hexbear.net 6 points 7 hours ago (4 children)

I’m making a bot for my roguelike game and I’m getting a lot of pushback from my friends for using “AI”. I’m making little autonomous bots that use textual embeddings and a sprinkle of local LLMs to help plan their actions so they can seem real. No AI slop is shown to the user; it’s all behind the scenes for them to plan their actions and navigate the world as I work around the limitations of reinforcement learning. I publish my experiments on a microblog, and apparently people were gossiping that “I was doing something bad (with AI)”, and I was met with a lot of hostility when I tried to share it with friends. Made me feel really bad, but maybe they are right?

[–] discountsocialism@hexbear.net 4 points 4 weeks ago (1 children)

Two can play that game. Put down a big number, then negotiate down before you sign.