Putting Machine Learning (ML) and Artificial Intelligence (AI) to work

Todor Popov, Data Scientist at Cobuilder, tells the story behind the establishment of the organisation’s Machine Learning team. Today, the team has successfully developed a solution that saves 60% of manual work and helps the company deliver value to its clients in an efficient and modern way. Through their work, Cobuilder experts can now devote more time to value-added work such as quality control and worry less about repetitive manual tasks.

You don’t need to be part of the Information and Communications Technology (ICT) sector to be familiar with the hype around Machine Learning (ML) and Artificial Intelligence (AI). From face recognition on Facebook to Siri and Cortana, everyone has already had some experience with AI and ML. However, the application of these technologies in day-to-day business is far from mature.

At Cobuilder, the drive to make the most of AI and ML started in 2017. As a forward-thinking company, we systematically look at new technologies and utilise them as part of our services. In this case we took an intrapreneurial approach to investigating the potential gains from AI and ML and arranged a weekend hackathon.

The start of a hackathon tradition

The first hackathon aimed to solve two problems directly related to one of our core services: the transformation of unstructured information into machine-readable data. Cobuilder helps construction companies create data models based on standards, enabling them to structure data consistently across their organisations and exchange it with the numerous actors in the construction supply chain.

“… in under 48 hours we were able to prove that a complex problem we had been struggling with for years could be solved by using AI”

The aim was to speed up the process of extracting unstructured data from PDF files into our structured data models, reducing repetitive work and increasing efficiency across the organisation.

The results were more than promising: in under 48 hours we were able to prove that a complex problem we had been struggling with for years could be solved by using AI. The essential reason: we boiled the complex problem down to a classification problem – a technique that categorises data into a distinct, desired number of classes, each of which can be assigned a label – something well suited for AI to solve.
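To make the term concrete: a classification model maps an input to exactly one of a fixed set of labels. The toy sketch below illustrates the idea on invented construction-property snippets; the data, labels and scikit-learn model are purely illustrative assumptions, not the solution built at the hackathon.

```python
# Toy illustration of classification: mapping a free-text snippet to exactly
# one of a fixed set of labels. The examples, labels and model choice are
# invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

examples = ["thermal conductivity 0.035 W/mK", "fire class A1", "density 25 kg/m3"]
classes = ["thermal", "fire", "physical"]  # the distinct, desired classes

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(examples, classes)

print(clf.predict(["reaction to fire class B"]))  # -> ['fire']
```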

During the hackathon our teams analysed the problem and the relevant data, developed a proposed solution and tested its performance. Not only were the results promising, but the two teams participating in the hackathon achieved success using different approaches.

The hackathon teams, secluded in a room, doing some real implementation work

This was the first of a series of hackathons that have become a tradition at Cobuilder. Hackathons have proved to be a good tool for solving complex problems that require combining diverse expertise from multiple departments.

The beginning of the Machine Learning team at Cobuilder

Overall, this first hackathon experience was successful in more ways than one. Beyond showcasing the benefits of incorporating AI into our company workflows, it gave us good insight into the kinds of problems a hackathon format can solve.

After validating the idea at the hackathon, the company management decided to dedicate resources to AI/ML projects within the company. Today, the Machine Learning team has three members. They were ‘head-hunted’ from internal departments, coupling core company knowledge with the power of ML and AI.

Todor Popov, Pavel Pavlov and Valentin Zmiycharov – the members of the AI and ML Team at Cobuilder from left to right

The first project that officially made the roadmap was related to interpreting Safety Data Sheets (SDSs): turning them from PDF files into machine-readable data – something the company had been doing manually for more than a decade. SDSs are an integral part of the construction industry. They are a tool that provides harmonised information on the hazards of working with construction materials in an occupational setting.

The Automated Loading and Interpreting of SDSs, or ALIS, is designed to aid our colleagues in digitising SDSs. The approach is for ALIS to extract the information for the more common cases, leaving the experts to confirm, append and, if necessary, change the extracted data. ALIS has more than 70% accuracy in extracting the correct data from the unstructured PDF and populating the right digital template. This efficiency reduces manual labour by up to 60%.

How it works – the technical stuff

Our experience and knowledge in handling SDSs, combined with the sufficient amount of data gathered in our databases, created the necessary conditions and stepping stones for ALIS to emerge and fulfil its purpose. Natural Language Processing (NLP), image processing and deep neural networks are used to extract and transform the required data from the PDF files. The whole algorithm bundle is packaged neatly as a Python package. However, providing a packaged solution is only one part of a data science engineer’s day; another major part is deployment and integration with the company’s systems.
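As a rough illustration of the general shape of the task – not the actual ALIS pipeline – the sketch below pulls raw text out of an SDS PDF and maps it onto a structured template. The pdfminer.six library, the regex-based section splitting and the file name are assumptions standing in for the NLP, image-processing and deep-learning models described above.

```python
# Illustrative sketch only: turn an SDS PDF into a {section_number: text} template.
import re
from pdfminer.high_level import extract_text  # assumes pdfminer.six is installed

# Safety Data Sheets follow a standard 16-section layout,
# e.g. "SECTION 2: Hazards identification".
SECTION_HEADER = re.compile(r"^\s*SECTION\s+(\d{1,2})\b.*$", re.IGNORECASE | re.MULTILINE)

def sds_to_template(pdf_path: str) -> dict:
    """Extract raw text from an SDS PDF and split it by section header."""
    text = extract_text(pdf_path)
    headers = list(SECTION_HEADER.finditer(text))
    template = {}
    for i, header in enumerate(headers):
        start = header.end()
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        template[int(header.group(1))] = text[start:end].strip()
    return template

if __name__ == "__main__":
    sections = sds_to_template("example_sds.pdf")  # hypothetical file name
    print(sections.get(2, "")[:200])  # e.g. the start of the hazards section
```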

“… AI has helped us achieve higher-quality services through improved workflows, and we can now produce features that would have been unfeasible without it”

The ALIS package is deployed as a Flask API, packaged with Docker and deployed on Azure. The Flask microframework is well documented and very lean, which makes it perfect for such tasks. Furthermore, the Docker-Azure combination works well for multi-platform development and provides hassle-free deployment. Another plus of using Azure is the stability, reliability and security it provides – qualities the company values and identifies with.
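To give a sense of what such a service looks like, here is a minimal Flask sketch of an extraction endpoint. The route name, request format and placeholder response are assumptions for illustration, not Cobuilder’s actual interface.

```python
# Minimal sketch of exposing an extraction model behind a Flask API.
# Endpoint name, payload and response shape are illustrative assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/extract", methods=["POST"])
def extract():
    # Expect a PDF uploaded as multipart/form-data under the key "file".
    uploaded = request.files.get("file")
    if uploaded is None:
        return jsonify(error="no file provided"), 400

    # In the real service the NLP / image-processing models from the Python
    # package would run here; this sketch just returns a placeholder result.
    result = {"filename": uploaded.filename, "fields": {}}
    return jsonify(result)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

In a setup like this, the Docker image would typically install the Python package and run the app behind a production WSGI server, which is what makes the same container easy to ship to Azure.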

Our plans for the future?

So far, AI has helped us achieve higher-quality services through improved workflows. We can now produce features that would have been unfeasible without it. This brings a real competitive advantage which, of course, extends to our customers. For instance, experts now have more time to do thorough quality-control checks of the digital SDSs in our systems and give guidance on the specific improvements needed to meet the REACH requirements. The process of digitally processing SDSs is quite complex; however, reducing repetitive manual work and replacing it with quality control is a huge leap in the right direction in terms of service quality.

“We, and our customers, can focus on innovation, creation and other value-adding work, while computers are doing all the manual and repetitive tasks.”

Currently, our second project, which will allow much faster processing of SDSs, is in its testing phase. Another project, related to classifying construction products, is also in progress. This will enable our colleagues to find the relevant classifications more quickly and handle much bigger volumes of data more easily.

Furthermore, two more projects for new functionalities are planned for the near future. One of them relates to detecting semantic data duplicates and assessing data similarity in our systems. The second relates to interactive data visualisation tools for communicating data. This would enable us to easily illustrate our systems in depth, beneath the complexity on the surface.

There are many opportunities ahead. Our customers will benefit from faster, more reliable services. Making data machine-readable is something of a credo at Cobuilder. This means that the people who create value in our business AND in our customers’ businesses will be focused on innovating, creating and delivering value. And all of this while computers do the manual and repetitive work.