{"id":2499,"date":"2025-09-30T19:21:06","date_gmt":"2025-10-01T00:21:06","guid":{"rendered":"https:\/\/sites.imsa.edu\/hadron\/?p=2499"},"modified":"2025-09-30T19:21:32","modified_gmt":"2025-10-01T00:21:32","slug":"world-models-a-way-to-predict-the-future","status":"publish","type":"post","link":"https:\/\/sites.imsa.edu\/hadron\/2025\/09\/30\/world-models-a-way-to-predict-the-future\/","title":{"rendered":"World Models: A Way to Predict the Future"},"content":{"rendered":"<p><span style=\"font-weight: 400\">\u00a0 \u00a0 Artificial intelligence has progressed incredibly quickly over the last few years. However, one issue that plagues current architectures is the lack of a deeper understanding of how the world evolves over time. World models, which are internal simulations learned by machine learning models that predict how the environment will change given certain actions,<\/span> <span style=\"font-weight: 400\">attempt to fix this problem. Theoretically, this allows them to plan and adapt to consequences instead of just reacting to immediate inputs and outputs.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center\"><b>Figure 1<\/b><\/p>\n<p style=\"text-align: center\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-2509 aligncenter\" src=\"http:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/car-ezgif.com-optimize-300x218.gif\" alt=\"\" width=\"601\" height=\"437\" srcset=\"https:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/car-ezgif.com-optimize-300x218.gif 300w, https:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/car-ezgif.com-optimize-768x559.gif 768w, https:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/car-ezgif.com-optimize-600x437.gif 600w\" sizes=\"auto, (max-width: 601px) 100vw, 601px\" \/><\/p>\n<p style=\"text-align: center\"><i><span style=\"font-weight: 400\">This racecar moves based on what its internal world model predicts the next section of the track to be.<\/span><\/i><\/p>\n<p style=\"text-align: center\"><i><span 
style=\"font-weight: 400\">Source: Ha et al. (2018)<\/span><\/i><\/p>\n<p>&nbsp;<\/p>\n<p><b>How It Works<\/b><\/p>\n<p><span style=\"font-weight: 400\">There are three main steps in the creation of world models:<\/span><\/p>\n<p><span style=\"font-weight: 400\">1. Encoding<\/span><\/p>\n<p><span style=\"font-weight: 400\">Raw sensory data (images, sensor readings, audio) are compressed into a latent space, essentially a multidimensional space in which similar items are placed closer together (via models like variational autoencoders or contrastive predictive coding). The motivation behind this is that it is far easier to work with abstract features than with precise raw data. For instance, it is easier to think of a picture as a banana rather than as many yellow pixels.<\/span><\/p>\n<p><span style=\"font-weight: 400\">2. Learning<\/span><\/p>\n<p><span style=\"font-weight: 400\">Once the latent space is encoded, the model learns what is known as a transition function, a mapping from the current latent state to the next one, by making predictions, comparing each prediction with what actually happens, and then adjusting its parameters. Two general approaches exist: deterministic and stochastic models. Deterministic models assume the next state depends only on the current state, while stochastic models incorporate randomness and predict a probability distribution over outcomes rather than a single state.<\/span><\/p>\n<p><span style=\"font-weight: 400\">3. Predicting<\/span><\/p>\n<p><span style=\"font-weight: 400\">Along with predicting the next state, an intelligent model should also learn whether a prediction is actually useful, that is, whether it supports accurate long-term planning. For example, for a world model in an autonomous car, it is useful to see that a pedestrian will start moving across a crosswalk, and not useful to see that a digital billboard will change advertisements. 
Mechanisms such as reward functions or termination signals (win or lose) are used to encourage this.<\/span><\/p>\n<p><span style=\"font-weight: 400\">After this, agents are able to generate trajectories of states inside the learned model, allowing them to improve their overall performance.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center\"><b>Figure 2<\/b><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-2500 aligncenter\" src=\"http:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/rnn-300x220.png\" alt=\"\" width=\"693\" height=\"509\" srcset=\"https:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/rnn-300x220.png 300w, https:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/rnn.png 492w\" sizes=\"auto, (max-width: 693px) 100vw, 693px\" \/><\/p>\n<p style=\"text-align: center\"><i><span style=\"font-weight: 400\">An example architecture for a world model made to predict future events in Doom. V is the encoding process (vision) while M is the learning process.<\/span><\/i><\/p>\n<p style=\"text-align: center\"><i><span style=\"font-weight: 400\">Source: Ha et al. (2018)<\/span><\/i><\/p>\n<p>&nbsp;<\/p>\n<p><b>Strengths and Limitations<\/b><\/p>\n<p><span style=\"font-weight: 400\">\u00a0 \u00a0 World models enable strong performance from small amounts of data, since agents can learn faster by planning in an internal simulation rather than reacting to the real world. They support long-horizon reasoning, letting agents think many steps ahead and producing models that can be applied to novel tasks. However, there are limitations. Errors in prediction can accumulate quickly, compounding into bias that misleads the model. Additionally, current memory horizons are limited to anywhere from a few seconds to a couple of minutes. 
The computation and infrastructure needed for accurate, high-fidelity world models at scale have still not been achieved.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><b>Milestones<\/b><\/p>\n<p><span style=\"font-weight: 400\">\u00a0 \u00a0 Of course, getting to where we are now has taken a (relatively) long time, requiring many independent breakthroughs. Early progress included PlaNet, which first demonstrated that prediction in latent spaces yielded greater efficiency. This was followed by the Dreamer algorithm, which brought the first stable training methods and demonstrated strong performance in various environments, rivalling traditional reinforcement learning approaches. Then came MuZero, which was critical in showing the generalization of world models: it learned and mastered games like Go, chess, and shogi without being told the rules. Finally, a more recent example of world models is Genie 3, which is capable of simulating a 3D environment at 24 frames per second for up to a minute. 
So, while limitations remain, Genie 3 shows how world models are moving beyond abstract prediction toward usable simulations.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center\"><b>Figure 3<\/b><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-2508 aligncenter\" src=\"http:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/genie-ezgif.com-optimize-300x164.gif\" alt=\"\" width=\"644\" height=\"352\" srcset=\"https:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/genie-ezgif.com-optimize-300x164.gif 300w, https:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/genie-ezgif.com-optimize-638x350.gif 638w, https:\/\/sites.imsa.edu\/hadron\/files\/2025\/09\/genie-ezgif.com-optimize-600x328.gif 600w\" sizes=\"auto, (max-width: 644px) 100vw, 644px\" \/><\/p>\n<p style=\"text-align: center\"><i><span style=\"font-weight: 400\">An example of Genie 3 generating an environment in real time based on world models.<\/span><\/i><\/p>\n<p style=\"text-align: center\"><i><span style=\"font-weight: 400\">Source: \u667a\u8da3AI\u7504\u9009<\/span><\/i><\/p>\n<p>&nbsp;<\/p>\n<p><b>Conclusion<\/b><\/p>\n<p><span style=\"font-weight: 400\">\u00a0 \u00a0 World models represent a promising future in deep learning where systems can not only act but also anticipate. By compressing perception, predicting physics, and planning ahead, they move ever closer to what humans naturally do. The road to robust, general world models is long, but it is clear that we are chugging steadily along.<\/span><\/p>\n<p><b>Sources<\/b><\/p>\n<p><span style=\"font-weight: 400\">Ha, D., Schmidhuber, J. (2018). Recurrent World Models Facilitate Policy Evolution. arXiv preprint arXiv:1809.01999.<\/span><\/p>\n<p><span style=\"font-weight: 400\">\u8c37\u6b4c\u201d\u4e16\u754c\u6a21\u62df\u5668\u201dGenie3\u60ca\u8273\u767b\u573a!\u4e00\u53e5\u8bdd\u751f\u62103D\u4e16\u754c,\u652f\u6301\u5206\u949f\u7ea7\u8d85\u957f\u8bb0\u5fc6 [Google's \"world simulator\" Genie 3 makes a stunning debut! One sentence generates a 3D world, with minute-level long memory] | \u667a\u8da3AI\u7504\u9009. (n.d.). \u667a\u8da3AI\u7504\u9009. 
<\/span><a href=\"https:\/\/www.aifun.cc\/en\/google-releases-genie3.html\"><span style=\"font-weight: 400\">https:\/\/www.aifun.cc\/en\/google-releases-genie3.html<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400\">Genie 3: A new frontier for world models. (2025, August 5). Google DeepMind. <\/span><a href=\"https:\/\/deepmind.google\/discover\/blog\/genie-3-a-new-frontier-for-world-models\/\"><span style=\"font-weight: 400\">https:\/\/deepmind.google\/discover\/blog\/genie-3-a-new-frontier-for-world-models\/<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400\">Kaige. (2024, July 26). DreamerV3 and Muzero. Medium. Retrieved September 23, 2025, from <\/span><a href=\"https:\/\/medium.com\/@kaige.yang0110\/dreamerv3-and-muzero-0bcce4ec998b\"><span style=\"font-weight: 400\">https:\/\/medium.com\/@kaige.yang0110\/dreamerv3-and-muzero-0bcce4ec998b<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400\">MuZero: Mastering Go, chess, shogi and Atari without rules. (2020, December 23). Google DeepMind. <\/span><a href=\"https:\/\/deepmind.google\/discover\/blog\/muzero-mastering-go-chess-shogi-and-atari-without-rules\/\"><span style=\"font-weight: 400\">https:\/\/deepmind.google\/discover\/blog\/muzero-mastering-go-chess-shogi-and-atari-without-rules\/<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400\">Hafner, D., Pasukonis, J., Ba, J., Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv preprint arXiv:2301.04104.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Hafner, D., Lillicrap, T., Ba, J., Norouzi, M. (2019). Dream to Control: Learning Behaviors by Latent Imagination. arXiv preprint arXiv:1912.01603.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Hafner, D., Lillicrap, T., Fischer, I., Villegas, R., Ha, D., Lee, H., Davidson, J. (2018). Learning Latent Dynamics for Planning from Pixels. arXiv preprint arXiv:1811.04551.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Ha, D., Schmidhuber, J. (2018). World Models. 
arXiv preprint arXiv:1803.10122.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u00a0 \u00a0 Artificial intelligence has progressed incredibly quickly over the last few years. However, one issue that plagues current architectures is the lack of a deeper understanding of how the world evolves over time. World models, which are internal simulations learned by<\/p>\n","protected":false},"author":1089,"featured_media":2508,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[13],"tags":[],"class_list":["post-2499","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/posts\/2499","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/users\/1089"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/comments?post=2499"}],"version-history":[{"count":3,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/posts\/2499\/revisions"}],"predecessor-version":[{"id":2511,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/posts\/2499\/revisions\/2511"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/media\/2508"}],"wp:attachment":[{"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/media?parent=2499"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/categories?post=2499"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/tags?post=2499"}],"curies":[{"nam
e":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}