In the dynamic landscape of artificial intelligence, the pursuit of seamless integration between humans and technology stands as a paramount goal. The ambition is to render interactions so natural that using cutting-edge technologies becomes second nature. To explore where the industry is heading, I talked to one of the speakers at our Epic AI Dev Summit, Or Gorodissky, Vice President of Research and Development at D-ID, the summit’s co-organizer. Or is an expert in Natural User Interface (NUI) technologies and has been developing Generative AI at D-ID since 2018.
Alex: What is the core vision behind the development of Natural User Interface (NUI), and how does it contribute to the broader landscape of AI agents?
Or: The vision behind the development of the Natural User Interface (NUI) is to revolutionize the way people interact with technology. NUI represents a significant leap from the previous interfaces, most notably GUI (Graphical User Interface), emphasizing natural, face-to-face conversations with digital entities. Our goal is to do away with the mouse and keyboard and replace them with an interface that allows you to “speak” with your devices directly, face-to-face, as you would with another human being. This approach humanizes digital interactions, making them more accessible, intuitive, and inclusive. It effectively bridges the gap between human and digital realms, enhancing user engagement and satisfaction across a wide range of business sectors.
Alex: What future advancements in AI and video generation are you most excited about, and how do you foresee the industry preparing for these upcoming changes?
Or: The most exciting future advancements in AI and video generation relate to the creation of more immersive, human, and engaging interfaces. With technologies like Apple’s Persona avatar in its Vision Pro, D-ID’s real-time interactive Agents, and Runway’s text-to-video generator, the industry is moving towards a more interactive and lifelike mode of communication. This evolution will likely see all companies leveraging these generative AI products to enhance customer interaction. I think that preparing for these changes involves staying updated with technological developments, investing in R&D, and ensuring that these new tools are accessible and adaptable to multiple business needs.
Alex: What are the obstacles faced in creating AI-generated video content, and potential solutions that can be applied universally?
Or: Creating high-quality videos using AI is still considered a difficult task. Not all of the problems have been solved and developing solutions can take time. Many companies grapple with producing videos that are not only temporally consistent and high-resolution but also created with low latency or high throughput, all while keeping computational costs in check.
It’s a challenge to steer a company in a way that ensures that technical and product roadmaps both innovate and deliver impactful products. To overcome this, we are focusing on cycles of innovation and improvement, prioritizing impactful efforts and strategically building towards future capabilities. Emphasizing user-centric design and leveraging existing solutions for non-core aspects help streamline the process.
Alex: Integrating AI technologies into existing systems and platforms is often complex. How does D-ID’s technology integrate with existing systems and platforms, and what are the challenges in these integrations?
Or: D-ID’s technology integrates with existing systems and platforms through its advanced API, designed to be flexible and user-friendly. This API allows for seamless integration of our AI capabilities, enabling businesses to personalize their AI experiences and align them with specific needs and audiences. The main challenge in these integrations, we believe, is ensuring compatibility and maintaining the balance between technological sophistication and user experience. Our approach focuses on making these integrations as intuitive and straightforward as possible, providing tools and solutions that tailor our capabilities to each user’s unique requirements.
Alex: Staying ahead in the rapidly advancing field of AI is crucial. What general strategies should companies employ to remain at the forefront of AI technology?
Or: Well, it’s risky to rely solely on technical superiority, as everything you build will eventually become a commodity. It may take some time, years if you’re lucky, but you won’t get a lot of sleep if you question your business strategy every time a new research paper comes out.
Instead, try to be laser-focused and user-centric. Double down on the things that bring value and leverage existing solutions when the value doesn’t justify the effort. Technology in and of itself is not a silver bullet. Make sure that both the product and business aspects are constantly addressed to ensure your effort is most effective.
You want your users to stay with you even when the next big open-source solution comes out. Think about that when you choose where to invest your focus.
Alex: Ethical considerations are crucial in AI development. How do you believe the industry should ensure ethical practices in the creation and deployment of AI technologies?
Or: Yes, of course, ethical practices must be a constant touchstone for AI developers. This means ensuring a commitment to transparency, respect for privacy, and adherence to ethical standards. I believe companies should work closely with privacy experts and ethicists to establish and follow strict guidelines. Regular audits and moderation, along with collaborations with regulatory bodies, can ensure responsible AI development. Additionally, the implementation of tracking systems, watermarks, and content moderation tools can help mitigate misuse. It’s crucial for industry leaders to lead by example, creating a culture of ethical AI use that balances innovation with responsibility and public trust.
Alex: Could you share a memorable success story or a particularly innovative use case of D-ID’s technology in action?
Or: Radio Fórmula, a renowned media entity in Mexico’s Grupo Fórmula network, leveraged D-ID’s technology to create AI-generated newscasters, revolutionizing their news broadcasting approach. This collaboration led to a notable surge in engagement from younger audiences, demonstrating the impactful fusion of traditional media with advanced AI technology. For a detailed exploration of this innovative venture, you can read the full case study on D-ID’s website: Radio Fórmula and D-ID Case Study.
On January 30, 2024, Or will share more of his insights about NUI at our Epic AI Dev Summit, presenting his talk “Crafting AI agents with a natural user interface”. Full agenda and registration here!