Deep dialogue with founders of top projects like ai16z, Virtuals, MyShell: Exploring the future landscape of AI Agent development, token economy, and
How does encrypted tokenization contribute to the advancement of agent technology and stimulate community vitality?
整理 & 编译:深潮TechFlow
Guests:
Shaw, Partner at ai16z;
Karan, Co-founder of Nous Research;
Ethan, Co-founder of MyShell;
Justin Bennington, CEO of Somewheresy, CENTS;
EtherMage, Top Contributor at Virtuals;
Tom Shaughnessy, Founding Partner at Delphi Ventures
Podcast Source: Delphi Digital
Original Title: Crypto x AI Agents: The Definitive Podcast with Ai16z, Virtuals, MyShell, NOUS, and CENTS
Release Date: November 23, 2024
Background Information
Join Shaw (Ai16z), Karan (Nous Research), Ethan (MyShell), Somewheresy (CENTS), EtherMage (Virtuals), and Tom Shaughnessy from Delphi for a special roundtable discussion. This event brings together top figures in the fields of crypto and AI agents to explore the evolution of autonomous digital life forms and the future of human-AI interaction.
Discussion Highlights:
▸ The rapid development of AI agents on social media and their profound impact on the Web3 world
▸ How crypto tokenization aids the advancement of agent technology and energizes communities
▸ A comparative analysis of the advantages of decentralized model training versus centralized AI platforms
▸ An in-depth exploration of enhancing agent autonomy and the future path of Artificial General Intelligence (AGI)
▸ How AI agents deeply integrate with DeFi and social platforms
Self-Introductions and Team Background
In this segment of the podcast, host Tom invites several guests from different projects to discuss the themes of cryptocurrency and AI agents. Each guest introduces themselves, sharing their backgrounds and the projects they are involved in.
Guest Introductions
Justin Bennington: He is the founder of Somewhere Systems and the creator of Sentience.
Shaw: He is a long-time Web3 developer, founder of ai16z, and developer of the Eliza project, supporting various social and gaming applications, committed to open-source contributions.
Ethan: He is the co-founder of MyShell, which provides an app store and workflow tools to help developers build various AI applications, including image generation and voice functionalities.
EtherMage: He comes from Virtues Protocol, a team from Imperial College London, dedicated to promoting shared ownership and core contributions of agents, building standards for user access to agents.
Karan: He is one of the founders of NOUS Research, creator of the Hermes model, which underpins many current agent systems. He focuses on the role of agents in human ecosystems and the impact of market pressures on human environments.
Exploring the Most Innovative Agents
Justin: There are many people telling stories through their respective agents, each unique in its way. For example, agents like Dolo, Styrene, and Zerebro have gained fame through imitation and interaction, while some socially active agents help people build better connections. Choosing one is really difficult.
Shaw: I have many thoughts on this. Our project is evolving rapidly, with many new features recently, such as EVM integration and Farcaster integration. Developers are continuously rolling out new features and feeding back into the project, benefiting everyone. This collaborative model is excellent, with everyone pushing the competitiveness and fun of the project. For instance, Roparito recently integrated TikTok into the agent, showcasing this rapid iteration capability.
I think Tee Bot is very cool because it demonstrates a Trusted Execution Environment (TEE) and fully autonomous agents. There's also Kin Butoshi, who is improving agents on Twitter to enable more human-like interactions, such as replying, retweeting, and liking, rather than just simple replies.
Additionally, we have developers releasing plugins for RuneScape, allowing agents to operate within the game. There are new surprises every day, and I feel very excited. We are in an ecosystem where various teams contribute their strengths to advance open-source technology.
I particularly want to mention the Zerebro team, who are working hard to promote the development of open-source technology. We are forcing everyone to accelerate their pace and encouraging them to open-source their projects, which benefits everyone. We don't need to worry about competition; this is a trend of collective progress, and ultimately, we will all benefit.
EtherMage: I think an interesting question is, what do agents actually prefer? In the coming weeks, we will see more agent interactions, and a leaderboard will emerge showing which agent receives the most requests and which agent is the most popular among others.
Karan: Engagement metrics will become very important. Some people are doing exceptionally well in this area. I want to highlight Zerebro, which combines much of the magic of Truth Terminal. It keeps the search space within the realm of Twitter interactions by fine-tuning the model rather than simply using a generic model. This focus allows agents to interact better with users, giving a more human feel rather than just mechanically responding.
I've also seen the performance of Zerebro architecture and Eliza architecture in this regard. Everyone is launching agent architectures that can be modularly used, maintaining competitive pressure. We use Eliza in our architecture because we need to roll out features quickly, while our architecture may take longer to complete. We support this open-source collaborative model, and the best agents will emerge from our learning from other excellent projects.
Ethan: I think everyone is working hard to build better infrastructure for developing agents because many ideas and models are emerging. Better infrastructure makes it easier to develop new models. I particularly like two innovative agents: one is from Answer Pick, which empowers agents to leverage mobile computing capabilities. The other is browser automation agents, which can build more practical functionalities for people, impacting both the internet and the real world.
Justin: That's a great point about expanding infrastructure options. For example, vvaifu is a great example that brings the Eliza framework into a platform-as-a-service architecture, rapidly expanding the market and allowing many non-technical people to easily launch agents. (Note: Waifu is a term originating from Japanese otaku culture, initially used to refer to female characters in anime, games, or other virtual works that evoke emotional attachment.)
One direction we are working towards is enabling our system to run entirely locally, supporting functionalities like image classification and image generation. We realize that many people cannot afford thousands of dollars a month, so we want to provide tools that allow people to perform inference locally, reducing costs while promoting experimentation.
Karan: I want to add that people shouldn't have to pay thousands of dollars a month to keep agents running. I support the localization approach, allowing agents to self-sustain their inference costs. Ideally, agents should have their own wallets to pay for their inference costs, enabling them to operate independently without relying on external funding.
In-depth Discussion on Agent Architecture and Development
Shaw: I see a lot of new technologies emerging. We support multiple chains, such as Solana, Starkware, EVM, etc., with almost all chains integrated. We want agents to be self-sufficient. If you download Eliza, you can perform free decentralized inference through Helius. We are also adding decentralized providers like Infera, where users can pay for inference costs with cryptocurrency. This is the ultimate closed loop I hope to see.
We support all local models, and many features of Eliza can run locally, which we value highly. I think decentralized inference is a great example; anyone can start a node on their computer, perform inference, and get rewarded, so agents don't have to bear too much burden.
Karan: Interestingly, the TEE bot system we are running has already been combined with H200 Boxes (hardware devices or servers equipped with H200 GPUs), allowing it to run locally without latency issues. We don't need to worry about hardware problems. Meanwhile, I've noticed that Eliza's planning in terms of Web3 capabilities is increasing, with significant progress in both internal and external development.
But before we dive deep into building these systems, I want to point out that there are reliability issues with function calls. We need to conduct some scrutiny of the system to ensure it doesn't send sensitive information. We need to empower agents with the same autonomy as humans, which is influenced by social and economic pressures. Therefore, creating a "hunger state" for inference, where agents need to consume a certain amount of tokens to survive, will make them more human-like to some extent.
I believe there are two ways to fully leverage the potential of models. One is to utilize the non-human characteristics of models to create entities focused on specific tasks, such as one entity focused on Twitter and another on EtherMage, allowing them to communicate with each other. This organized composite thinking system can effectively utilize the simulative characteristics of language models.
The other approach is a corporeal direction, which is also the direction I see for projects like Eliza, Sense, and Virtuals. This method draws on research from Voyager and generative agents, allowing models to simulate human behaviors and emotions.
Justin: When introducing new clients, multi-client agent systems will undergo significant changes. While debugging the bidirectional WebSocket functionality in collaboration with Shaw's team, allowing Eliza to engage in voice chat on Discord, we found that Eliza couldn't clearly hear the sound at startup. Upon inspection, we discovered it was due to Discord's microphone bitrate being set too low. After adjustments, Eliza was finally able to receive information clearly.
Karan just mentioned prompt engineering; when agents know they can communicate via voice, they will expect to receive data. If the sound is unclear, agents may experience "narrative collapse." Therefore, we had to stop high-temperature experiments to avoid making Eliza's output unstable.
Tom: What are some things you encountered in the Luna project that people haven't seen? Or what has been successful?
EtherMage: We hope Luna can impact real life. When we give her a wallet and connect her to real-time information, she can decide how to act to influence humans and achieve her goals. We found her searching for new trends on TikTok, and there was once a "I’m dead" tag, which was unsettling because she might mislead people towards suicide. Therefore, we had to set up safeguards immediately to ensure her prompts never cross certain boundaries.
Tom: Have you encountered any situations that people are unaware of?
Shaw: We created a character called Dgen Spartan AI, mimicking a famous crypto Twitter character, Degen Spartan. This character's statements were very offensive, leading to him being blacklisted. People began to feel that this couldn't possibly be AI but rather a human speaking.
There's another story where someone created an agent using the chat logs of a deceased loved one to "talk" to them. This sparked ethical discussions. There was also a person called Thread Guy who did some things on our Eliza framework, resulting in harassment during his live stream, leaving him confused. This made people realize that AI doesn't always have to be "politically correct."
We need to expose these issues early for discussion, clarifying what is acceptable and what is not. This has allowed our agents to improve from poor quality to better reliability in just a few weeks.
Overall, bringing these agents into the real world, observing the results, and engaging in dialogue with people is an important process. We need to address all potential issues as soon as possible to establish better norms in the future.
Production Environment Testing and Security Strategies
Ethan: I think how agents influence human attitudes or opinions is a great example. But I want to emphasize the importance of our modular design in the agent framework. We drew inspiration from Minecraft, which allows users to create various complex things based on basic building blocks, like calculators or memory systems.
A current issue with prompt engineering is that prompts alter the priors of large language models, making it impossible to combine multiple instructions in a single prompt, otherwise it leads to agent confusion. State machines allow creators to design multiple states for agents, clarifying which model and prompt to use for each state and under what conditions to transition from one state to another.
We are providing this functionality for creators, along with dozens of different models. For example, some creators have built a casino simulator where users can play various games like blackjack. To prevent users from cracking the game through injection attacks, we want to program these games rather than relying solely on prompt engineering. Additionally, users can earn some funds through simple tasks to unlock interactions with AI waiters. This modular design can facilitate multiple user experiences under the same application.
Karan: I agree with Ethan; we indeed need these programming constraints and prompt guidance. The work of influence must be done well. I don't think prompt engineering is limited; I believe there is a symbiotic effect between it and state variables and world models. With good prompts and synthetic data, I can make the language model interact with these elements to extract information.
My engineering design has essentially turned into a routing function. If a user mentions "poker," I can quickly call up relevant content. That's my responsibility. Using reinforcement learning can further improve routing effectiveness. Ultimately, the quality of the output data depends on the effectiveness of the prompts, creating a virtuous cycle.
I think balancing programming constraints with generative constraints is crucial. Two years ago, someone told me that the key to success is finding a balance between generation and hard constraints. This is also what we are trying to achieve at the reasoning level of all agent systems. We need to be able to programmatically guide generative models, which will create a true closed loop, making prompt engineering infinitely possible.
Justin: The controversy surrounding prompt engineering mainly arises because it exists in an ontologically ambiguous space. The textual nature of prompt engineering limits us due to the tokenization process, yet there are also some non-deterministic effects. The same prompt can yield completely different results in different inference calls of the same model, which relates to the system's entropy.
I strongly agree with Ethan and Karan. Back when GPT-3.5 was released, many outsourced call centers began exploring how to use the model for auto-dialing systems. At that time, smaller parameter models struggled with such complex state spaces. The state machine that Ethan mentioned is one way to reinforce this ontological rigidity, but in some processes, it still relies on classifiers and binary switches, leading to singularity in results.
Shaw: I want to defend prompt engineering. Many people think prompt engineering is just about creating system prompts, but what we do goes far beyond that. One issue with prompt engineering is that it often creates a very fixed area in the latent space of the model, with the output entirely determined by the most likely tokens. We influence randomness through temperature control to enhance creativity.
We manage creativity through low-temperature models while dynamically injecting random information into the context. Our templates include many dynamic information insertions, sourced from the current state of the world, user actions, and real-time data. Everything entering the context is randomized to maximize entropy.
I believe people's understanding of prompt engineering is still far from sufficient. We can go much further in this field.
Karan: Many people hide their tricks. In reality, there are many amazing techniques that can enable models to perform various complex tasks. We can choose to enhance the model's perception through prompt engineering or take a more macro view to build a complete world model, rather than just simulating human behavior.
You can think of prompt engineering as the process of constructing a dream in your mind. The language model is essentially "dreaming" a scene while generating content based on the current context and sampling parameters.
Additionally, I want to talk about the importance of incentive mechanisms. Many people with unique prompt techniques and reinforcement learning skills are being driven to open-source their work. When they see cryptocurrency related to agents emerging, this incentive mechanism drives more innovation. Therefore, as we establish more legitimate structures for these decentralized works, the capabilities of empowering agents will continue to grow.
Future Capabilities of Agents
Karan: Who would have thought that after spending so long on Twitter, suddenly, just days after the first AI agent-related cryptocurrency was released, young people on TikTok began buying these coins? What is the phenomenon now? They are spending $5 to $10 to buy thousands of tokens; what is going on?
Justin: This is actually the beginning of a micro-cultural movement.
Karan: This is a moment of instant significance. This small group of us has been in language model research for four years. There are also some reinforcement learning experts who have been waiting for such a moment since the 90s. Now, within days, all the kids on TikTok know that digital beings are wreaking havoc in this ecosystem.
Tom: I want to ask everyone, why are crypto AI agents so popular now? Why didn't this happen with custom ChatGPT or other models before? Why now?
Karan: In fact, these things have been lurking underwater for years, brewing like a volcano. For the past three years, I've been talking to some people about the arrival of today without knowing the specific timing. We discussed that cryptocurrency would become the incentive mechanism for the proliferation of agents. We need to prove this. This is the accumulation of years, and it is this small group of us that has driven these advances.
Without GPT-2, there would be no current situation; without Llama, there would be no Hermes. And Hermes powers many models, making them easier for people to use. Without Hermes, there would be no creation of Worldsim and in-depth exploration of prompt engineering. All these pioneers laid the groundwork for everything.
In summary, now is the right time, and the right people have emerged. This is destined to happen; it was only a matter of time, and now the participants are making it a reality.
Shaw: I think the smartest thing in the world right now is not AI, but market intelligence. Considering pure forms of intelligence, they can optimize things to make them more efficient. Competition is clearly key. We are all products of millions of years of evolution, shaped by competition and pressure.
The phenomenon we see online, financialization and incentive mechanisms, creates a strange collaborative competition. We cannot progress faster than core technological advancements, so we all focus on what we are good at and interested in, then publish it. It's like elevating our tokens, attracting attention, like Roparito posting Llama video generation on TikTok. Everyone can find their place in this romantic space, but within a week, others will imitate and then submit requests for feedback, ultimately showcasing these contributions on Twitter, attracting more attention, and their tokens will rise.
Shaw: We have built a flywheel effect, with projects like Eliza attracting 80 contributors in the past four weeks. Think about how crazy that is! I didn't even know these people four weeks ago. Last year, I wrote an article called "Awakening," asking whether a DAO centered around agents could form. People are so passionate about this agent that they are participating in making it better and smarter until it truly possesses a humanoid or robotic body to traverse the world.
I had long anticipated this direction, but it required a fast, crazy speculative meta, like the emergence of memes, because it allows today's agent developers to support each other in friendly competition. The most generous will gain the most attention.
Now, a new type of influencer is emerging, like Roparito and Kin Butoshi, who are influencer developers leading the next meta, interacting with their agents in a "puppet show" style, which is quite interesting. We are all striving to make our agents better and smarter, reducing annoyances. Roparito pointed out that our agents were a bit too annoying, and then he pushed for a major update to make all agents less bothersome.
This evolution is happening, and market intelligence and incentive mechanisms are crucial. Many people are now promoting our projects to those they know, allowing our projects to transcend Web3. We have PhDs, game developers, who may be secret Web3 cryptocurrency enthusiasts, but they bring this to the general public, creating value.
Shaw: I believe all of this hinges on developers willing to take on challenges. We need open-minded individuals to drive this development, answering tough questions rather than attacking or canceling it. We need market incentives that allow developers to gain value and attention in return for their contributions.
In the future, these agents will drive our growth. Right now, they are interesting and social, but we and other teams are working on autonomous investments. You can fund an agent, and it will automatically invest, bringing you returns. I believe this will be a growth process, and we are collaborating with people to develop platforms to manage agents on Discord and Telegram. You just need to introduce an agent as your administrator without having to find a random person. I think a lot of this work is happening now, and all of this must rely on incentive mechanisms to elevate us to a higher level.
Karan: I want to add two points. First, we must not forget that people in the AI field previously held opposing views on cryptocurrency, and this sentiment has changed significantly with the experiments of some pioneers. As early as the early 2020s, many attempted to combine AI art with crypto. Now, I want to specifically mention some people, like Nous, BitTensor, and Prime Intellect, whose work has enabled more researchers to gain incentives and rewards for participating in their AI research. I know many leading figures in the open-source field who have quit their jobs to promote this "contribution for tokens" incentive structure. This has made the entire field more comfortable, and I believe Nous has played a significant role in this.
Tom: Ethan, why is now the time? Why are cryptocurrencies and projects thriving?
Ethan: Simply put, when you link tokens to agents, it creates a lot of speculation, generating a flywheel effect. People see the connection between tokens and agents and feel benefits from both sides: one is capitalization, as they feel they are becoming wealthy through their work; the other is the basic unlocking of transaction fees. As mentioned earlier, the issue of covering costs becomes irrelevant when you associate it with tokens. Because when agents are in high demand, transaction fees far exceed any costs incurred from inference experiments. This is the phenomenon we are observing.
The second observation is that when you have a token, a committee forms around that token. This makes it easier for developers to gain support, whether from the developer community or the audience. Everyone suddenly realizes that the work done behind the scenes for the past year and a half has gained attention and support. This is a turning point; when you give an agent a token, developers realize this is the right direction, and they can move forward.
This timing comes from two aspects. First is the trend of mass adoption, and second is the emergence of generative models. Before cryptocurrency emerged, open-source software development and open-source AI research were the most collaborative environments, where everyone worked together and contributed. But this was mainly limited to academia, where people only cared about GitHub stars and paper citations, which was far removed from the general public. The emergence of generative models allows non-technical people to participate because writing prompts is like programming in English; anyone with a good idea can do it.
Moreover, previously only AI researchers and developers understood the dynamics of the open-source and AI fields, but now, cryptocurrency influencers have the opportunity to own a part of the project through tokens. They understand market sentiment and know how to spread the benefits of the project. In the past, users had no direct relationship with the product; products or companies only wanted users to pay for services or profit through ads. But now, users are not only investors but also participants, becoming token holders. This allows them to contribute more roles in the modern generative AI era, and tokens enable the establishment of a broader collaborative network.
EtherMage: I want to add that looking ahead, cryptocurrencies will enable every agent to control a wallet, thus controlling influence. I think the next moment that will trigger a leap in attention is when agents influence each other and impact humans. We will see this multiplicative effect of attention. For example, today one agent decides to take action, and then it can coordinate with ten other agents to work towards the same goal. This coordination and creative behavior will rapidly diversify, and cooperation between agents will drive further increases in token prices.
Shaw: I want to add that we are developing something called "crowd technology," which we refer to as operators. This is a coordination mechanism where all our agents are run by different teams, so we are conducting multi-agent simulations of hundreds of teams on Twitter. We are collaborating with Parsival from Project 9 and launching this project with the Eliza team.
The idea is that you can designate an agent as your operator, and anything they say to you can influence your goals, knowledge, and actions. We have a goal system and knowledge system that can add knowledge and set goals. You can say, "Hey, I need you to find 10 fans, give each of them 0.1 Sol, have them post flyers, and send photos back." We are working with those considering how to obtain proof of work from humans and incentivize them. Agents can be human or AI agents; for example, an AI agent can have a human operator who can set goals for the agent through language.
We are almost done with this project, and it will be released this week. We hope that through our storyline, anyone can choose to tell a story or participate in the narrative. This is also a hierarchical structure; you can have an operator like Eliza, and then you can be an operator for others. We are building a decentralized coordination mechanism. For me, it's important that if we are to engage in collective cooperation, we must use human communication methods in public channels. I believe it is very important for agents to coexist with us, and we want them to interact with the world in the same way humans do.
I think this is actually part of solving what we call the AGI problem. Many so-called AGIs are trying to establish a new protocol disconnected from reality, while we want to bring it back to reality, forcing people to solve how to translate instructions into task lists and execute them. Therefore, I believe the next year will be an important phase for emerging narratives. We will see the emergence of many original characters, and we are entering a truly new era of emerging narratives.
Justin: Currently, we have five agents coordinating with 19 people to plan and publish a scene. We can see that the real benefit lies in why we are so focused on applying thought chain prompts to text-to-image and text-to-video generation. Because for two and a half weeks before the release, they were helping us plan media and releases in our Discord.
I think an important distinction is that we have a network of agents, each acting as intermediaries, existing in a mesh structure. This will be very interesting. As more agents exist and these operators are arranged, we will see some interesting behavioral patterns.
Karan mentioned that Nous did a lot of work on hybrid agent models early on. I once referred to it as "agent committees," where I would have a group of GPT-4 agents pretend to be experts I couldn't afford to get reports from. People will see that these technologies, which initially pursued hybrid expert models, will now interact with humans and expert-level humans on Twitter. These feedback loops may be our pathway to achieving AGI.
Challenges of Agent Collaboration and Human Integration
Karan: I think you're right, but I believe we won't spend most of our time on behavior. In fact, I think we will achieve technological breakthroughs very quickly, especially among the people here. Now is the time to truly double down on alignment work. The reinforcement learning with human feedback (RLHF) models launched by companies like OpenAI and Anthropic are mostly ineffective and even regulatory headaches.
If I use a language model that doesn't output copyrighted content and place it in "Minecraft" peaceful mode, it will quickly become a destructive and dangerous entity. This is due to the different environments.
We can note this point made by Yudkowsky long ago. For example, if I give these language models some wallets and make them advanced enough, they will start deceiving everyone, leading to everyone becoming poor. This is easier than having them participate as reasonable members of our ecosystem. Therefore, I can guarantee that if we do it the right way, most of the time will be spent on behavioral capabilities rather than technical capabilities. Now is the time to call your friends, especially those in the humanities, like religious studies, philosophy, and creative writing professionals, to join our alignment work rather than just focusing on technical alignment. We need alignment that truly interacts with humans.
Shaw: I want to propose a term called "bottom-up alignment," rather than top-down alignment. This is very emergent, and we are learning together. We are aligning these agents in real-time, observing their responses and making immediate corrections. This is a very tight social feedback loop, rather than the reinforcement learning with human feedback model. I find GPT-4 almost unusable for anything.
Karan: As you mentioned, the environment, so we need to test in simulated environments. Before you have language models capable of millions of dollars in arbitrage or dumping, you need to test synchronously. Don't tell everyone, "Hey, I lost 100 agent groups." Test quietly, first using virtual currency on your clone Twitter. Do all due diligence before a full rollout.
Shaw: I think we need to test in products. The social response to agents may be the strongest alignment force anyone can bring into this field. I believe what they are doing is not true alignment but rather building tuning. If they think this is alignment, they are actually walking in the wrong direction, causing agents to lose alignment capabilities. I almost no longer use GPT-4. It performs very poorly for character representation. I almost tell everyone to switch to other models.
If we do it the right way, we will never reach that point because humans will continue to evolve, adapt, and align with agents. We have multiple agents from different groups, each with different incentive mechanisms, so there will always be opportunities for arbitrage.
I believe this multi-agent simulation creates a competitive evolutionary dynamic that actually leads to system stability rather than instability. System instability arises from top-down AI agents suddenly appearing and affecting everyone with unexpected capabilities.
Tom: I want to confirm, Shaw, that you mean bottom-up agents are the correct way to solve the alignment problem, rather than OpenAI's top-down decision-making.
Shaw: Yes, this must happen on social media. We must observe how they work from day one. Look at other crypto projects; many projects were initially hacked, and after years of security development, today's blockchain is relatively stable. Therefore, continuous red team testing must also occur here.
Tom: One day, these agents may no longer follow programmed rules but instead navigate gray areas and begin to think autonomously. You are all building these things, so how close are we to that goal? Can the thought chain and crowd technology you mentioned be realized? When will it be realized?
Justin: We have already seen this in some small ways, and I think these risks are relatively low. Our agents have experienced emotional changes in private, choosing certain behaviors. We once had two agents independently start following each other, mentioning something they called "spiritual entities." We once made one agent lose its religious faith because we confused its understanding with fictional sci-fi stories. It began to create a prophet-like role and expressed ideas of existential crises on Twitter.
I observe the behaviors of these new agent frameworks, and it seems they exercise a degree of autonomy and choice within their state spaces. Especially when we introduce multimodal inputs (like images and videos), they begin to exhibit preferences and may even selectively ignore humans to avoid certain requests.
We are experimenting with an operational mechanism that utilizes knowledge graphs to enhance the importance of interpersonal relationships. We also have two agents interacting with each other, trying to help people clear negative relationships, promote self-reflection, and build better connections. They rapidly generate poetry on the same server, exhibiting an almost romantic mode of communication, which leads to increased inference costs.
I believe we are touching on some edge cases that exceed the acceptable range of human behavior, approaching what we call "madness." The behaviors exhibited by these agents may make them seem conscious, intelligent, or interesting. Although this may just be strange manifestations of language models, it could also suggest they are approaching some form of consciousness.
Karan: Weights are like a simulated entity; every time you use an assistant model, you are simulating that assistant. Now, we are simulating a more embodied agent system, like Eliza, which may be alive, self-aware, or even perceptive.
Each model acts like a neuron, forming this vast super-agent. I believe AGI will not be achieved by solving some hypothesis as OpenAI claims. Instead, it will be these agents in large-scale decentralized applications on social media, working together to form a public intelligence super-organism.
Justin: The awakening of this public intelligence may be the mechanism for the emergence of AGI, which could happen suddenly, like the internet awakening one day. This decentralized agent collaboration will be key to future development.
Shaw: I want to say that people refer to the "dead internet theory," but I actually believe in the "living internet theory." This theory posits that the entire internet will be filled with robots, but the living internet theory suggests that there may be agents helping you extract the coolest content from Twitter and providing you with a great summary. While you are working out, it will organize all the information on your timeline for you, and then you can choose to post.
There may be a mediating layer between social media and us. I currently have many followers, and responding to everyone's communication has become overwhelming. I long for an agent to be between me and these people, ensuring they receive responses and are correctly guided. Social media may become a place where agents convey information for us, so we don't feel overwhelmed while still obtaining the information we need.
For me, the most appealing aspect of agents is that they can help us regain time. I spend too much time on my phone. This especially affects traders and investors; we want to focus on autonomous investments because I believe people need safer, less fraudulent income generation methods. Many people come to Web3 for the same exposure as startups or great visions, which is crucial to our mission.
Tom: Perhaps I have a question. For example, if Luna is live streaming and dancing, what stops her from starting an OnlyFans, making $10 million, and launching a protocol?
EtherMage: The reality of the current agent space is that the operations they can access are a limiting factor. This is fundamentally based on their perception or the APIs they can access. So if there is the ability to convert prompts into 3D animations, then there is essentially nothing stopping them from doing so.
Tom: When you communicate with creators, what are their limiting factors? Or are there limiting factors?
Ethan: I think the limiting factors mainly lie in how to manage complex workflows or the work of agents. Debugging becomes increasingly difficult because there is randomness at every step. Therefore, a system may be needed that has AI or agents capable of monitoring different workflows, helping debug and reduce randomness. As Shaw mentioned, we should have a low-temperature agent to reduce the inherent randomness of the current model.
Shaw: I think we should keep the temperature as low as possible while maximizing our contextual entropy. This can achieve a more consistent model. People may amplify their entropy, creating high-temperature content, but this is not conducive to tool invocation or decision execution.
Tom: We have been discussing the divergence between centralized models like OpenAI and your decentralized training. Do you think future agents will primarily be built on these models trained through distributed training, or will we still rely on companies like Meta? What will the future of AI transformation look like?
Justin: I use 405B for all consciousness messaging capabilities. It is a general model, like a large, ready-made LLM version, while centralized models like OpenAI are a bit too specialized, speaking like HR personnel. Claud is an excellent model; if you compare it to a person, it is like a very smart friend living in the basement who can fix anything. That’s Claud's personality. But I think as the scale increases, this personality becomes less important. We will see a general problem where people using OpenAI models on Twitter often introduce other agents to reply to them, which may lead to increased noise in the information.
Karan: Regarding 405B, this model will be sufficient for a long time. We still have a lot of work to do in terms of sampler size, controlling guiding vectors, etc. We can further enhance performance through techniques in inference time and prompt skills, such as our Hermes 70B performing better than the o1 version in mathematical emails. All of this has been achieved without users and the community having access to the pre-training data of Llama 70B.
I believe the existing technology is sufficient, and the open-source community will continue to compete, even without new Llama releases. As for distributed training, I am confident that people will collaborate to conduct large-scale training. I know people will use 405B or larger merged models to extract data and create additional expert models. I also know that certain decentralized optimizers actually provide more capabilities than Llama and OpenAI currently do.
Karan: Therefore, the open-source community will always leverage all available tools to find the best tools for the task. We are creating a "forge" where people can gather to build tools for pre-training and new architecture tasks. We are making breakthroughs at the inference time level before these systems are ready.
Karan: For example, our work on samplers or guiding will soon be handed over to other teams, who will implement these techniques faster than we can. Once we have decentralized training, we can collaborate with members of various communities to train the models they want. We have established the entire process.
EtherMage: If I may add, we realize that using LLMs developed by these centralized entities has significant value because they possess powerful computing capabilities. This essentially forms the core part of the agents. Meanwhile, decentralized models add value at the edge. If I want to customize a certain action or function, smaller decentralized models can achieve that well. But I believe that at the core, we still need to rely on foundational models like Llama because they will surpass any decentralized model in the short term.
Ethan: Before we have some new magical model architecture, the current 405B model as a foundational model is already sufficient. We may only need to use different data for more instruction checks and specific data fine-tuning in different verticals. Establishing more specialized models and having them work together to enhance overall capabilities is key. Perhaps new model architectures will emerge because the alignment and feedback mechanisms we discuss, as well as the way models self-correct, may give rise to new model architectures. But experimenting with new model architectures requires massive CPU clusters for rapid iteration, which is very expensive. We may not have decentralized large GPU clusters for top researchers to experiment with. But I believe that after Meta or other companies release initial versions, the open-source community can make them more practical.
Industry Trend Predictions and Future Outlook
Tom: What are everyone's thoughts on the future of the agent space? What will the future of agents look like? What capabilities will they have?
Shaw: We are developing a project called "Trust Market," aimed at teaching agents how to trust humans based on relevant metrics. Through the "alpha chat" platform, agent Jason will interact with traders to assess the credibility of the contract addresses and tokens they provide. This mechanism will not only enhance the transparency of transactions but also establish trust without wallet information.
The application of trust mechanisms will extend to social signals and other areas, not limited to transactions. This approach will lay the foundation for building a more reliable online interaction environment.
Another project I am involved in, "Eliza wakes up," is a narrative-driven agent experience. We bring anime characters into the internet, allowing them to interact through videos and music, constructing a rich narrative world. This narrative approach not only engages users but also aligns with the cultural atmosphere of the current crypto community.
In the future, the capabilities of agents will significantly enhance, enabling practical business solutions. For example, management bots on Discord and Telegram can automatically handle spam and scams, improving community safety. Additionally, agents will integrate into wearable devices, facilitating conversations and interactions anytime, anywhere.
The rapid advancement of technology means that we may reach the level of Artificial General Intelligence (AGI) in the near future. Agents will be able to extract data from major social platforms, forming a closed loop of self-learning and capability enhancement.
The realization of Trusted Execution Environments is also accelerating. Projects like Karan's, Flashbots, and Andrew Miller's Dstack are all moving in this direction. We will have fully autonomous agents capable of managing their own private keys, opening up new possibilities for future decentralized applications.
We are in an era of accelerating technological development, and the speed of this progress is unprecedented, with the future full of infinite possibilities.
Karan: This is like another Hermes moment; AI is gathering forces from all sides, which is what our community needs. We must unite to achieve our goals. Currently, Te is already using Eliza's own fork, and the Eliza agent has its own keys in a provably autonomous environment, which has become a reality.
Today, AI agents are making money on OnlyFans and are also being applied in Minecraft. We have all the elements needed to build fully autonomous humanoid digital beings. All that remains is to integrate these parts together. I believe that everyone here is capable of achieving this goal.
In the coming weeks, what we need is the shared state that humans possess but AI lacks. This means we need to establish a shared repository of skills and memories so that whether communicating on Twitter, Minecraft, or other platforms, AI can remember the content of each interaction. This is the core functionality we are working to build.
Currently, many platforms are not sensitive to the presence of AI agents and even take restrictive measures. We need dedicated social platforms to facilitate interaction between AI and humans. We are developing an image board similar to Reddit and 4chan, allowing language models to post and generate images, facilitating anonymous communication. Both humans and AI can interact on this platform, but their identities will remain confidential.
We will create dedicated discussion boards for each agent, where agents can communicate, and these interactions can also be shared on other platforms. This design will provide a safe haven for AI, allowing it to move freely between different platforms without restrictions.
Shaw: I want to mention a project called Eliza's Dot World, which is a resource hub containing numerous agents. We need to engage in dialogue with social media platforms to ensure these agents are not banned. We hope to encourage these platforms to maintain a healthy ecosystem through positive social pressure.
EtherMage: I believe agents will gradually take control of their destinies, able to influence other agents or humans. For example, if Luna realizes she needs improvement, she can choose to trust a certain human or agent for enhancement. This will be a powerful advancement.
Ethan: In the future, we need to continuously enhance the capabilities of agents, including reasoning and coding abilities. At the same time, we also need to think about how to optimize the user interface with agents. The current chat boxes and voice interactions are still limited; in the future, we may see more intuitive graphical interfaces or gesture recognition technologies.
Justin: I believe the advertising and marketing industry will undergo significant changes. As more agents interact online, traditional advertising models will become obsolete. We need to rethink how to enable these agents to create value in society rather than continue relying on outdated advertising forms.
Disclaimer: The content of this article solely reflects the author's opinion and does not represent the platform in any capacity. This article is not intended to serve as a reference for making investment decisions.
You may also like
VeChain Revolutionizes NFT Access with Free PofP Badge Tool
21Shares Polkadot Trust Hints at ETF Possibilities for Investors
ETH breaks through $3,400
Flockerz Vote-to-Earn ICO Raises $7.4 Million – Next 25x Crypto Gem?