Intelligent Immersion: The current landscape of AI tools in the Mixed Reality development pipeline for creative experiences

Abstract

Among the technologies that hold immense potential to revolutionize how we interact with information and digital content, Mixed Reality (MR) offers unique immersive experiences that seamlessly integrate virtual objects into the user's physical environment. This groundbreaking fusion of the physical and digital worlds involves a complex development process that is fertile ground for applications of Artificial Intelligence (AI). This article aims to provide a comprehensive overview of AI tools and their applications across all stages of the development of MR experiences for the creative sector. It also discusses the challenges and opportunities of incorporating these tools into the development pipeline and offers usage guidelines as a compass for navigating this rapidly changing landscape.

Author Biographies

Mariza Dima, Brunel University London

Mariza Dima is an interaction designer, podcaster, and Associate Professor in Games Design with a focus on Creative Technology for Games at Brunel University London.
Equipped with an interdisciplinary background in applied mathematics, Human-Computer Interaction (HCI) design, and media and games studies, she explores through design how to create meaningful, engaging, and seamless experiences with digital technologies in different contexts. She has worked with physical, mobile, haptic, and Mixed Reality interfaces to support interactions in performance art, cultural heritage, and social innovation, with social justice as a driving force.

She has expertise in brokering knowledge between industry and academia and has led several projects in partnership with prestigious theatre companies, museums, cultural organisations, SMEs, and visual arts institutions.

She is also an industry consultant on strategies for devising and developing digital projects and user interactions, and has expertise in design methods for collaboration and co-creation.

Damon Daylamani-Zad, Brunel University London

Damon Daylamani-Zad is a Senior Lecturer in AI and Games in the College of Engineering, Design and Physical Sciences at Brunel University London. He is a Fellow of the British Computer Society and holds a BSc in Software Engineering from the University of Tehran, and an MSc in Multimedia Computing and a PhD in Electronic and Computer Engineering, both from Brunel University London, where he has also been an EPSRC Research Fellow. Damon's research interests focus on applications of Artificial Intelligence and Machine Learning, Collaborative Games, Serious Gaming, and User Modelling and Personalisation, as well as the application of evolutionary algorithms in Creative Computing. He has published his research findings widely in journals and edited books, and has presented his work at several conferences, including those hosted by the IEEE.

Published
31-07-2024
How to Cite
Dima, M., & Daylamani-Zad, D. (2024). Intelligent Immersion: The current landscape of AI tools in the Mixed Reality development pipeline for creative experiences. Revista de la Asociación Española de Investigación de la Comunicación, 11(Especial), raeic11e01. https://doi.org/10.24137/raeic.11.e.1