Intelligent Immersion: The current landscape of AI tools in the Mixed Reality development pipeline for creative experiences
Abstract
Among the technologies that hold immense potential to revolutionize how we interact with information and digital content, Mixed Reality (MR) offers unique immersive experiences that seamlessly integrate virtual objects into the user's physical environment. This fusion of the physical and digital worlds entails a complex development process that is fertile ground for applications of Artificial Intelligence (AI). This article provides a comprehensive overview of AI tools and their applications across all stages of the development of MR experiences for the creative sector. It also discusses the challenges and opportunities of incorporating these tools into the development pipeline and offers usage guidelines as a compass for navigating this rapidly changing landscape.