As artificial intelligence (AI) has grown in popularity, it has come to be seen as a powerful force that will change our daily lives while increasing society's overall efficiency. Recent advances in large-scale models and multimodality have also fueled the growth of related research.
With AI in the spotlight, various industries are pushing research on it to the extreme. However, technology development cannot be separated from real-life applications. As AI is undergoing many ups and downs at the moment, it is imperative to consider how to efficiently promote the study of large-scale models, how to steadily introduce this high-end technology to the market, and how to deepen the implementation of AI across various industries.
In this article, we invited Mr. Wang Yan, Chief AI Architect of Zuoyebang, to introduce how the leading K12 EdTech company in China is utilizing artificial intelligence to its maximum potential to provide practical solutions tailored to millions of users' needs in education scenarios.
(Image from zybang.com)
1. The foundation of educational products: three keys of question bank construction
Founded in 2015, Zuoyebang is one of the first Chinese education technology companies to develop a large question bank. This database currently contains over 540 million questions, making it the largest among its competitors in the country. As Wang Yan pointed out, the question bank's success can be attributed to three factors.
First, Zuoyebang has an inherent advantage that enables it to maintain a competitive edge. Zuoyebang was initially incubated within Baidu, first as a Q&A community and later as a question search and answering platform. With support from full-time and part-time teachers, Zuoyebang has built the biggest online question bank platform in China.
In addition, Zuoyebang's model is derived from Baidu Knows, a community model that encourages users to solve problems together in a way that promotes sharing and communication, much like Web forums. Instead of having part-time college students build the question bank, as was the practice at the time, Zuoyebang undertook in-depth analysis and mining of user-generated content. It gradually became clear which questions users are most concerned about in the learning scenario, which are the most difficult, and which are most frequently encountered. This is an essential premise that clarifies the direction of our development.
Second, Zuoyebang places great importance on developing resources and the question bank. Not only does the question bank contribute significantly to user communication, but it is also essential to answering and teaching activities.
Third, it is not sufficient merely to have questions; it is also necessary to associate them with attributes such as the knowledge points examined, the difficulty, and other related knowledge points and labels. In this process, tags are processed and associated with technical infrastructure such as knowledge graphs and knowledge trees. As a result, the question bank can be efficiently retrieved and filtered, thereby maximizing its value.
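As a rough illustration of why tagging makes the bank retrievable, the sketch below indexes questions by knowledge point and filters them by difficulty. It is not Zuoyebang's implementation: the `Question` fields, the 1 to 5 difficulty scale, and the `QuestionBank` class are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    qid: str
    stem: str
    knowledge_points: frozenset  # tags such as {"fractions"}
    difficulty: int              # hypothetical scale: 1 (easy) to 5 (hard)


class QuestionBank:
    """Toy question bank with an inverted index from tag to question ids."""

    def __init__(self):
        self._questions = {}
        self._by_point = {}

    def add(self, question):
        self._questions[question.qid] = question
        for point in question.knowledge_points:
            self._by_point.setdefault(point, set()).add(question.qid)

    def find(self, knowledge_point, max_difficulty=5):
        """Return questions tagged with the point, easiest first."""
        ids = self._by_point.get(knowledge_point, set())
        hits = [self._questions[i] for i in ids
                if self._questions[i].difficulty <= max_difficulty]
        return sorted(hits, key=lambda q: (q.difficulty, q.qid))


bank = QuestionBank()
bank.add(Question("q1", "1/2 + 1/3 = ?", frozenset({"fractions"}), 2))
bank.add(Question("q2", "Shaded area of a circle", frozenset({"area", "fractions"}), 4))
bank.add(Question("q3", "0.5 x 0.2 = ?", frozenset({"decimals"}), 1))

easy_fraction_ids = [q.qid for q in bank.find("fractions", max_difficulty=3)]
```

Without the tags, the only way to answer "easy fraction questions" would be to scan every stem; with them, the lookup is a dictionary access followed by a filter.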
The question bank was initially managed manually, but later artificial intelligence was introduced to assist in its development. For example, AI can automatically recognize images of questions, and convert them into computer-readable formats.
By utilizing AI capabilities such as automatic tagging, formatted formulas, and error correction technology, it has been possible to increase accuracy while reducing labor costs. With a series of AI acceleration technologies and the construction of the question bank, Zuoyebang has optimized their response time for search and answer to one second, while their competitors would respond in approximately eight seconds.
In projects aimed at public schools, the question bank has become a critical component of supporting teaching activities. In this regard, one noteworthy scenario is the ability to push personalized and accurate questions via a high-quality homework system. A vital feature of the system is that it will personalize question push based on data analysis of different students, including the duration of time spent solving the questions and the level of mastery of varying knowledge points. Considering that students possess varying levels of knowledge, questions should be crafted accordingly.
To ensure a precise fit between the students' needs and the system, a thorough understanding of the students must be combined with the rich tagging dimension of the question bank. With the help of the question bank, Zuoyebang has developed high-quality homework products that assist students with consolidating information.
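The personalized push described above can be sketched in a few lines. This is only an illustration under stated assumptions: the mastery formula (credit for a correct answer, reduced when the student took longer than an expected time), the 1 to 5 difficulty scale, and the function names are hypothetical, not details given by Zuoyebang.

```python
def estimate_mastery(attempts):
    """Estimate mastery in [0, 1] from (correct, seconds_spent, expected_seconds).

    Assumption: a correct answer earns full credit when solved within the
    expected time and proportionally less when it took longer.
    """
    if not attempts:
        return 0.0
    total = 0.0
    for correct, spent, expected in attempts:
        if correct:
            total += min(1.0, expected / spent) if spent > 0 else 1.0
    return total / len(attempts)


def pick_questions(mastery, candidates, count=3):
    """Pick question ids, weakest knowledge points first.

    mastery: dict mapping knowledge point -> score in [0, 1]
    candidates: list of (qid, knowledge_point, difficulty 1..5)
    """
    def sort_key(candidate):
        qid, point, difficulty = candidate
        score = mastery.get(point, 0.0)
        target = 1 + round(score * 4)  # map mastery onto the difficulty scale
        return (score, abs(difficulty - target), qid)

    return [c[0] for c in sorted(candidates, key=sort_key)[:count]]


mastery = {"fractions": 0.25, "decimals": 0.9}
candidates = [("q1", "fractions", 2), ("q2", "fractions", 5), ("q3", "decimals", 4)]
homework = pick_questions(mastery, candidates, count=2)
```

Here the student is weakest in fractions, so both selected questions come from that knowledge point, with the one closest to the student's level ranked first.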
2. How can a machine solve graphical questions automatically?
Another feature of homework scenarios, besides the question bank, is the automatic assisted correction technology. Although multiple-choice questions have definite answers, for subjective questions such as essays or math word problems it can be difficult to determine whether a student has made a mistake or how the answer should be graded. During the assessment of math questions, OCR algorithms are used to identify the content of students' answers, while NLP algorithms conduct structured analysis, such as logical analysis and error identification.
Further, with knowledge graphs, students are informed where they are mistaken and why, and algorithms are used to help them identify the weak points in a learning report specifically tailored to their needs. This service system employs a Zuoyebang cloud-native and multi-cloud disaster recovery system, which provides high levels of stability and reliability. Therefore, even if many schools use it simultaneously, there will be no downtime.
According to Wang Yan, their outstanding performance in the industry is also due to accumulating a large number of users over a long period and regularly evaluating their homework system. It supports more question types than similar products on the market and is more accurate.
Teachers impart knowledge through lectures, and then students test themselves to find out which knowledge points they have learned and which ones they have not. Continuously correcting students' homework and tests has become a main source of burden for teachers throughout the closed loop of lecturing, doing homework, and correcting. Through the use of artificial intelligence, teachers can considerably reduce the amount of time they spend correcting students' homework and help as many students as possible make progress.
Currently, the homework product is in high demand and is used by thousands of teachers almost daily. Furthermore, this system can be customized to account for the different needs of teachers, as well as integrate their teaching experiences and styles. Students can find out which step they missed when solving math word problems. At its core, Zuoyebang will continue to reduce teachers' homework correction burden.
- Identify a way to solve math questions with graphs
Generally, text-only questions can be identified and matched in the question bank through OCR, text search, etc. But this is more complex for graphical questions. It is common for test papers to ask students to find the shaded area and calculate the size of a graph.
As a result, both text and images will need to be extracted. By searching the question bank with text only, the retrieval system can search for questions with similar stems, but the shape of the questions in the results obtained is different. In this case, vectorized feature extraction is necessary. Digital vector representations combined with the features of a large number of questions form a "text + image" feature.
This is especially true of primary school questions, which often mix text and images and depend on structured relationships between the boxes, including a line's starting point and its trajectory. The same applies to drawing questions.
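To make the "text + image" feature concrete, here is a minimal sketch that concatenates a text vector and an image vector and retrieves the nearest question by cosine similarity. The tiny hand-written vectors stand in for real embeddings, and the `image_weight` parameter and function names are assumptions for illustration; the article does not say how Zuoyebang actually combines the two modalities.

```python
import math


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def combine(text_vec, image_vec, image_weight=0.5):
    """Concatenate weighted, normalized text and image vectors into one feature."""
    text_part = [x * (1 - image_weight) for x in l2_normalize(text_vec)]
    image_part = [x * image_weight for x in l2_normalize(image_vec)]
    return l2_normalize(text_part + image_part)


def cosine(a, b):
    # Both inputs are unit length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def search(query, bank):
    """Return the id of the bank entry most similar to the query feature."""
    return max(bank, key=lambda qid: cosine(query, bank[qid]))


# Two questions share the same stem ("find the shaded area") but differ in figure.
same_stem = [1.0, 0.0]
bank = {
    "circle": combine(same_stem, [1.0, 0.0]),
    "triangle": combine(same_stem, [0.0, 1.0]),
}
query = combine(same_stem, [0.0, 1.0])  # the photographed question shows a triangle
best = search(query, bank)
```

A text-only search could not separate the two candidates because their stems are identical; the image half of the feature is what breaks the tie.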
- Test paper restoration: magic tools are often rooted in daily life
Additionally, Zuoyebang has obtained a number of patents in the areas of OCR, voice and image recognition, and homework correction technology. A patent that Zuoyebang has disclosed, for example, is for the efficient correction of distorted images using artificial intelligence. Teachers often ask students to redo their incorrect answers to make sure they have really understood and learned the material. Thus, the test paper needs to be restored to its blank state. However, the handwriting is sometimes uneven and the text distorted in photographs, so typographical correction requires specific technology.
Using deep neural networks, Zuoyebang can recognize human handwriting and distinguish it from printed fonts. With assistance such as image enhancement technology, it can effectively restore the test paper. This function is now available in the Zuoyebang app and has been applied to printers to restore test papers to their initial states within a short time. Formerly, students had to copy the questions manually and redo them, but now the process has been automated. Plus, this technology is used not only for restoring test papers but also for correcting and beautifying homework for online classes, improving the preservation of papers, and ensuring content recognition accuracy.
- Knowledge graphs: the gathering of expert knowledge
In the education scenario, the construction of a knowledge graph cannot be separated from the system of the human learning experience. Zuoyebang's semantic network is based primarily on lectures. Many teachers have helped summarize the relationships, dependencies, and paths to learning knowledge points during the teaching process. A prototype of a knowledge graph can be created by connecting these discrete knowledge points into a network.
R&D uses automated AI machine learning capabilities to implement the system on a large scale based on teachers' expertise and experience. By utilizing the knowledge graph, Zuoyebang could take customized homework design to the next level by recommending relevant or challenging questions that included more profound knowledge points. Zuoyebang uses knowledge graphs in various contexts, including lectures, homework corrections, personalized learning, and the association of questions in the question bank, allowing for more accurate retrieval and recommendation.
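The idea of connecting discrete knowledge points into a network of dependencies can be sketched with a plain dictionary. The graph below is a made-up example, and both helper functions are hypothetical; they only illustrate how a prerequisite graph supports tracing weak points and recommending what to study next.

```python
# Hypothetical prerequisite graph: knowledge point -> points it depends on.
PREREQS = {
    "fractions": ["division"],
    "division": ["multiplication"],
    "multiplication": ["addition"],
    "addition": [],
}


def prerequisite_path(graph, point):
    """Return every prerequisite of `point`, most basic first (depth-first)."""
    order, seen = [], set()

    def visit(current):
        for pre in graph.get(current, []):
            if pre not in seen:
                seen.add(pre)
                visit(pre)
                order.append(pre)

    visit(point)
    return order


def next_points(graph, mastered):
    """Return unmastered points whose prerequisites are all mastered."""
    return sorted(
        point for point, pres in graph.items()
        if point not in mastered and all(p in mastered for p in pres)
    )


review_order = prerequisite_path(PREREQS, "fractions")
ready_to_learn = next_points(PREREQS, {"addition"})
```

If a student misses a fractions question, `review_order` lists what to check from the most basic point upward, while `ready_to_learn` suggests the next point the student is prepared for.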
3. AI and digitalization: respecting users' habits
Traditionally used paper books, board writing, slides, etc., as well as student responses, including homework, test answers, and test scores, must all be digitalized. Without converting them into computer-readable data, all those advanced technologies become useless, and even technologies used to retrieve and recommend information bring no benefit. For this reason, voice and image, which are essential mediums for conveying teaching content, should be digitized.
China has been promoting the digital transformation of education for many years, and many classrooms now have digital screens and digitized teaching materials. Now, Zuoyebang is promoting the digitalization of homework. While AI may be beneficial to teaching and learning, it is essential to remember that it should respect the original habits of students and teachers, which cannot be easily altered. Students and teachers who are used to taking tests on paper might be uncomfortable if suddenly asked to do all their tests online. Even though tests can already be digitalized and conducted online, the difficulty of changing the habits of students and teachers makes wide implementation hard.
Accordingly, Zuoyebang innovated the homework system by introducing the "original paper with correction notes" feature. Teachers can use this feature to correct digitalized test papers online just as they did on physical test papers, and students will be able to read their teachers' marks and notes once they have received their results.
According to Wang Yan, the key to a successful digital future is changing the mindset, lowering the threshold for leveraging technology, and digitizing without affecting our way of living.
Several new requirements exist as we move from homework to a larger picture of education. Physical education teachers, for example, are very concerned with the intensity of exercise their students can withstand during class, including heart rate monitoring. A feature that reminds students to rest if their heart rate becomes too high is essential for PE teachers. The second example is counting jump rope reps, where we do not use a counter but rely on the camera to identify and count the reps automatically. Additionally, the process of capturing body movement is also considered a helpful technology to assist students in checking whether a particular move is appropriate. All of these applications are suitable for use with artificial intelligence.
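As a hedged sketch of how camera-based rep counting could work, assume an upstream pose model (not shown, and not described in the article) yields one normalized body height per video frame. Counting then reduces to detecting upward threshold crossings, with a second, lower threshold so that jitter is not counted as a jump. The thresholds and the function name are illustrative assumptions.

```python
def count_jumps(heights, up=0.6, down=0.4):
    """Count jumps in a per-frame height signal using two thresholds.

    A jump is counted when the height rises above `up`; the counter re-arms
    only after the height falls below `down`, which ignores small jitter.
    """
    count = 0
    airborne = False
    for height in heights:
        if not airborne and height > up:
            airborne = True
            count += 1
        elif airborne and height < down:
            airborne = False
    return count


frames = [0.1, 0.7, 0.8, 0.2, 0.5, 0.65, 0.3, 0.1]
reps = count_jumps(frames)
```

The same two-threshold pattern could in principle drive a rest reminder from heart rate readings, alerting when the rate exceeds a limit and clearing once it drops back.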
- Identifying opportunities for the use of artificial intelligence
As a technology-driven company, Zuoyebang's development team often focuses on what other technological advancements might be helpful or whether a good technology may enable the impossible to become possible. Accordingly, Wang Yan outlines the following logic for finding opportunities for AI: "We should first be clear about what technologies and resources we have and then determine how to apply those technologies in specific situations." Next, it is necessary to consider and weigh the potential of each technology and then conduct pilot and optimization plans.
- B2B scenarios demand more accuracy
Unlike B2C scenarios, B2B scenarios require a more customized approach. School systems, for example, have a higher standard for correcting papers and cannot make any errors. B2C products emphasize the richness of their capabilities and the ease of use, but accuracy is not as critical.
4. What does the future hold for the AI industry?
- R&D is the cornerstone of success; cutting-edge technology allows for more opportunities
Optimizing fundamental R&D can significantly improve applications' performance, so investing in basic research and development is imperative. Meanwhile, cutting-edge technology research may present more opportunities. In light of the continued growth of technology, what was previously impossible may one day become possible, and fellows are encouraged to devote 20% to 30% of their energy to focusing and following up on specific projects.
When selecting candidates, Zuoyebang looks for individuals with both an aptitude for academic research and strong engineering abilities. Strong engineering capability leads to more robust implementation and, thus, guarantees the success of AI. Ideally, we would like the talent to have the ability to perform full-stack operations and to be able to independently carry out the experimental design and execution to verify the effect of specific innovations in the field more quickly.
- Model scale matters, but so do the applications
While artificial intelligence can be applied to various fields, most of these models are based on general scenarios, so they cannot be used in education as they are. This point was illustrated by Wang Yan's example of handwriting recognition: normally the algorithm model assumes that the handwriting is that of adults. However, in the K12 education scenario, students of different ages have different writing styles, and neatness is not that important in teenagers' writing. Thus, AI needs to be refined based on specific conditions for use in education. In this manner, it is possible to explore and discover new business needs and promote the development of relevant technologies practically.
- Widely used products must be affordable to the general public
While large-scale models have shown improved performance, they are still far from widely used today. According to Wang Yan, large-scale models and multimodal research can improve accuracy, but even a one percent improvement is often realized at the cost of enormous computing power.
Nowadays, large-scale models with hundreds of millions to trillions of parameters require enormous clusters. Without large clusters, such a model would take a long time to run. Indeed, cluster hardware is constantly improving, and the associated costs of computing power are also decreasing, but technology that can be widely adopted must be low-cost and affordable. The wide use of artificial intelligence lies more in the innovation of ideas and its cost-effectiveness.
Bringing the much-hyped features of the technology to life and making them available to thousands of users is the key to making it a success. As Zuoyebang has millions of customers, the company needs to optimize its business operations to avoid incurring additional costs.
Zuoyebang now aims to offer users as many features and services as possible while maintaining affordability. The team will examine how to improve the use of computing power so that the device is not idle while also considering how to enhance and optimize the model and engineering architecture to accommodate a large number of transactions per second at the most reasonable cost.
Innovation can take place only through a problem-solving approach so that the tool can release more value, allowing more people to experience the convenience of technology as it advances.
Guest Introduction
Formerly the director of Baidu Knows and Baidu Baike, Mr. Wang Yan is now the chief architect and head of the AI lab of Zuoyebang. Mr. Wang directs the lab's work on various sectors of artificial intelligence, including image technology, large-scale and high-concurrency online architecture, question search and answering, AI correction, and question banks.