Speech by UNRE CEO Dr. Sun at the Spring Developer Conference
2019-06-09 19:21:32

At this spring's Rockchip Developers' Conference, the theme of "Wisdom for the Future" attracted nearly 1000 developers and industry leaders, and many well-known companies with global influence attended the event. Dr. Sun, founder of UNRE AI Limited, is a world-renowned expert in 3D intelligent vision and a successful serial entrepreneur of well-known Silicon Valley high-tech companies such as Weitek, C-Cube and DVS. He was invited to give a speech on the theme of "Connecting People with the World in 3D".

The following is Dr. Sun's speech on the theme of "Connecting People with the World in 3D":
The real world is three-dimensional, yet current capture and display technology confines us to a flat one.

With the advent of low-cost 3D cameras for smartphones, we are starting to interact with the world in 3D. The ecosystem of the 3D industry is being built, and we are facing another 10-year development opportunity.

Humans "see" the things around them with their eyes, "recognize and understand" those things, and form "cognition and decisions" about them. More and more AI researchers are trying to make AI do the same. Taking human 3D vision as the starting point, 3D intelligent vision has become a hot field of AI research and application.

As intelligent vision is applied more widely, the technology has entered the "3D" stage: the goal is not only to see and recognize something, but also to enter the scene and truly perceive it in the three-dimensional world. This requires intelligence with 3D spatial perception and cognitive ability, that is, 3D intelligent vision.

3D intelligent vision integrates many disciplines, combining computer vision, computer graphics and related fields with deep learning and big data.

In summary, 3D intelligent vision has several directions: first, 3D perception, that is, perceiving 3D space and acquiring and processing 3D depth; second, location perception, such as perceiving the position of the camera; third, 3D imaging and modeling, which goes beyond basic depth perception of the scene to obtain a complete geometric model of it; and finally, 3D understanding, that is, understanding the scene and the objects in it from 3D space.
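
As a small illustration of the first direction, acquiring and processing 3D depth, the sketch below back-projects a depth map into 3D points using the standard pinhole camera model. The intrinsics (fx, fy, cx, cy) and the tiny depth map are made-up values for illustration only; they are not parameters of any camera mentioned in this speech.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (metres) into an (N, 3) array of camera-space points.

    Pixels with depth <= 0 are treated as invalid and dropped.
    """
    h, w = depth.shape
    # u is the column index (x direction), v is the row index (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(float)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Hypothetical 2x2 depth map and intrinsics, chosen only for illustration.
depth = np.array([[1.0, 2.0],
                  [0.0, 4.0]])
points = depth_to_points(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(points)
```

The pixel with zero depth is discarded, so three of the four pixels become 3D points.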

The development of 3D intelligent vision benefits from the development of vision sensors, which can be roughly divided into two categories. One is passive sensors; the ordinary cameras we use today are examples. The other is active sensors, which emit a signal of their own and use it as the basis for measurement.

For 3D imaging, low-cost solutions include time-of-flight (TOF), structured light, stereo vision and light field. For users, they differ in imaging accuracy, speed, resolution, frame rate and dependence on the environment.
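
Two of these principles reduce to short textbook formulas, sketched below. For a pulsed TOF measurement the distance is half the round-trip time multiplied by the speed of light; for a rectified stereo pair the depth is Z = f * B / d, where f is the focal length in pixels, B the baseline and d the disparity. The function names and sample numbers are illustrative assumptions, and real sensors add calibration and noise handling that this sketch leaves out.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def tof_distance(round_trip_seconds):
    """Distance for a pulsed time-of-flight measurement: light travels out and back."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Illustrative numbers only.
print(tof_distance(10e-9))              # a 10 ns round trip
print(stereo_depth(700.0, 0.06, 21.0))  # 700 px focal length, 6 cm baseline, 21 px disparity
```

With these sample values, a 10 ns round trip corresponds to roughly 1.5 m, and the stereo example to a depth of about 2 m.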

For 3D modeling, especially 3D portrait modeling and analysis, costs have dropped sharply with the emergence of low-cost consumer-grade RGB-D sensors. The classical method for reconstructing a 3D portrait from a single image is the 3D Morphable Model (3DMM), a statistical method: it collects many face models, applies PCA to reduce their dimensionality and obtain a statistical model, and then fits that model to the face to be reconstructed. Today's deep learning methods use the same core idea. Texture reconstruction from multiple images is a natural extension of the single-image task.
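
The statistical idea behind 3DMM can be sketched in a few lines. This is a toy illustration under strong assumptions: the "faces" are random synthetic vectors rather than real scans, and the model is fitted directly to a 3D shape vector, whereas a real 3DMM pipeline fits to a 2D image through a camera projection. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a set of registered face scans:
# 50 "faces", each flattened to 30 numbers (10 vertices x 3 coordinates).
n_faces, n_dims, n_true_modes = 50, 30, 3
true_modes = rng.normal(size=(n_true_modes, n_dims))
true_mean = rng.normal(size=n_dims)
faces = true_mean + rng.normal(size=(n_faces, n_true_modes)) @ true_modes

# 1. Build the statistical model: mean shape plus principal components.
mean_shape = faces.mean(axis=0)
_, _, vt = np.linalg.svd(faces - mean_shape, full_matrices=False)
k = 3            # number of components kept (the dimension reduction step)
basis = vt[:k]   # shape (k, n_dims); rows are orthonormal

# 2. Fit the model to a new shape: solve for coefficients by least squares.
target = true_mean + np.array([0.5, -1.0, 2.0]) @ true_modes
coeffs, *_ = np.linalg.lstsq(basis.T, target - mean_shape, rcond=None)
reconstruction = mean_shape + coeffs @ basis

print("fit error:", np.linalg.norm(reconstruction - target))
```

Because the synthetic faces vary along only three directions, three components are enough to reproduce the target almost exactly; real face data needs many more components and a regularized fit.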

Next, 3D portrait analysis. The processing flow can be divided into data acquisition, pre-processing (removing spikes, filling holes, etc.), shape representation, measurement and matching. Application scenarios include authentication, expression analysis and aesthetic analysis. A typical application is Face ID on the iPhone, which captures 3D data of the user's face for enrollment and comparison. The emergence of Face ID shows that 3D portraits can already be used in some customized products. 3D portrait processing technology has many application scenarios in face recognition, the beauty industry, virtual try-on in new retail and other fields, enabling industrial upgrading.
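
The pre-processing step in this flow (removing spike-like outliers and filling holes) can be illustrated on a small depth map. The sketch below is one simple approach, a median over valid 3x3 neighbours, and is an assumption made for illustration rather than the method used in any product named here; the function name and the threshold value are hypothetical.

```python
import numpy as np

def clean_depth(depth, spike_threshold=0.2):
    """Replace holes (zeros) and spikes with the median of valid 3x3 neighbours."""
    depth = np.asarray(depth, dtype=float)
    cleaned = depth.copy()
    h, w = depth.shape
    for r in range(h):
        for c in range(w):
            # Valid neighbours: inside the image, not the pixel itself, depth > 0.
            neighbours = [
                depth[rr, cc]
                for rr in range(max(r - 1, 0), min(r + 2, h))
                for cc in range(max(c - 1, 0), min(c + 2, w))
                if (rr, cc) != (r, c) and depth[rr, cc] > 0
            ]
            if not neighbours:
                continue  # nothing reliable nearby; leave the pixel unchanged
            local_median = float(np.median(neighbours))
            is_hole = depth[r, c] <= 0
            is_spike = abs(depth[r, c] - local_median) > spike_threshold
            if is_hole or is_spike:
                cleaned[r, c] = local_median
    return cleaned

depth = np.array([
    [1.00, 1.00, 1.00, 1.00],
    [1.00, 0.00, 1.00, 1.00],   # 0.00 is a hole (no measurement)
    [1.00, 1.00, 5.00, 1.00],   # 5.00 is a spike
    [1.00, 1.00, 1.00, 1.00],
])
print(clean_depth(depth))
```

On this flat test surface both the hole and the spike are replaced by the surrounding depth of 1.0, while the other pixels are left unchanged.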


At the same time, as 3D sensors become popular and 3D data keeps growing, achieving a detailed understanding of 3D scenes becomes very important. One of the most important and effective ways to understand scenes is deep learning. Deep learning was first applied to 2D images, and convolutional neural networks are designed for 2D images. For 3D scenes, however, the input data are 3D point clouds, and in the past no convolutional neural network could naturally process unstructured 3D point clouds. With the development of artificial intelligence technology, neural networks can now process unstructured 3D point clouds directly. As a result, fine-grained understanding of 3D scenes has become possible.
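
The speech does not say which network designs made this possible. One widely used idea, found in PointNet-style architectures, is to apply the same small function to every point and then pool with a symmetric operation such as max, so the result does not depend on the order of the points. The sketch below shows only that structure, with untrained random weights; the layer sizes and names are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Untrained, randomly initialised weights: this demonstrates the structure only.
w1 = rng.normal(size=(3, 16))
w2 = rng.normal(size=(16, 32))

def global_feature(points):
    """Map an (N, 3) point cloud to a 32-dimensional feature vector."""
    hidden = np.maximum(points @ w1, 0.0)     # shared per-point layer with ReLU
    per_point = np.maximum(hidden @ w2, 0.0)  # second shared per-point layer
    return per_point.max(axis=0)              # symmetric pooling over all points

cloud = rng.normal(size=(100, 3))
shuffled = cloud[rng.permutation(len(cloud))]

# The feature does not depend on the order in which the points are listed.
print(np.allclose(global_feature(cloud), global_feature(shuffled)))
```

A regular 2D convolution, by contrast, relies on pixels sitting on a fixed grid, which is why it cannot be applied directly to an unordered set of points.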

In three years, the market is expected to exceed 10 billion US dollars, and in six years, 25 billion US dollars. This explosive growth is driven by the maturing of 5G networks, the upgrading of AI technology, the rise of multiple types of terminals, the spread of smartphones with 3D cameras and the continuous expansion of 3D applications.

The basic components of the 3D industry are capture equipment, analysis and processing tools together with semantic tools for vertical applications, and display devices. The analysis, processing and semantic tools can achieve some of their functions with 2D data, but images from 2D imaging lose feature information; 3D imaging captures not only the picture but also depth information.

Reconstructing the real world to serve subsequent human-computer interaction is a task only 3D technology can take on. It is the need for interaction that gives rise to 3D imaging. Rapid progress in computing engines (including VLSI vision processors, DSPs, NPUs, GPUs and FPGAs), algorithms (including computer vision, AI image processing, computational photography and computer graphics) and optoelectronic technology (for capture and display) has led to the rapid growth of the 3D industry.

Demand driven by industrial upgrading further enlarges the market for the 3D industry, including 4D video communication in the 5G era, virtual 3D try-on in new retail, measurement and analysis of 3D portraits and bodies in the beauty industry, and precise 3D modeling and measurement in the construction and manufacturing industries.

Since its founding in 2017, UNRE Technology has completed the full-stack UNRE AIO 3D intelligent vision development platform. The platform consists of the UNRE 3D Senz intelligent vision algorithm engine, the front-end UNRE U8090/U8091 3D cameras, and a deep integration of the UNRE 3D Senz engine with the RK3399Pro AI chip. The 3D camera modules cover front-facing 3D structured light and rear-facing 3D TOF.


As a representative enterprise in 3D intelligent vision, UNRE Technology has maintained an advanced product layout in many fields, such as 3D image processing and image compression, 3D object measurement, 4D live video broadcasting, full-stack TOF-SLAM solutions, new retail all-in-one machines, entertainment, medical care, security monitoring, industrial control and VR. It continues to strengthen its influence and product competitiveness in 3D vision, maintains in-depth cooperation with international giants such as Rockchip, Infineon and Intel, and is gradually gaining a stronger position and voice in the global 3D intelligent vision industry chain.

At present, 3D applications have reached the point of explosive growth. Basic technologies and equipment such as low-cost 3D cameras, 5G networks, low-cost computing and holographic display are ready, and the time is ripe to apply them in beauty, entertainment, new retail, instant messaging and other industries. UNRE Technology is one of the pioneers in the field of 3D intelligent vision. Let's create and enjoy the beautiful future of 3D connecting the world.