How to incorporate AI Small Language Models (SLMs) into packaging & processing equipment
This step-by-step approach breaks down what it really means to add generative AI into your packaging and processing equipment using Small Language Models (SLMs).
For OEM's recent series of articles on incorporating AI into packaging and processing equipment, I've covered the why and the what, as well as some off-the-shelf tools. (Briefly, the "why" centers around adding ChatGPT-style functionality to your equipment to make it much easier for operators and technicians to keep machines running efficiently at your CPG customers' plants.)
But so far we haven't touched on the "how" for those who want to roll up their sleeves and try this. Fair warning: this is not for the faint of heart, and as I point out at the end, waiting for off-the-shelf tools and functionality could be easier, albeit with less ability for your engineers to shape the outcome.
1. Define the use case and objectives
Reminder: the application of generative AI we're discussing uses a Small Language Model to create what is technically referred to as a Retrieval-Augmented Generation (RAG) chatbot. That is to say, the chatbot is fed specifically by your own manuals and documentation rather than relying solely on the open-web content many language models were originally trained on. Defining the use case, then, means identifying the specific tasks the model needs to perform.
Taking an iterative or crawl/walk/run approach, your crawl phase would mainly consist of troubleshooting and operational guidance. A walk or run phase might consist of a predictive maintenance application. As part of this you'd want to establish metrics for success, which is mainly accuracy in responding to queries, but also speed of response. You may also want to rule specific scenarios in or out of the chatbot's scope. For example, you may decide to limit the chatbot to just the documentation you supply, or you may choose to let customers add their own standard operating procedures (SOPs).
2. Gather relevant training data
This includes all of the manuals and documentation for a specific machine, SOPs, and so forth. Models are rapidly expanding beyond text and can now interpret the contents of a photograph, a technical drawing, or even video, so you could decide whether to experiment with including such content. A walk or run phase could consider incorporating aggregated operational data collected by other users of the machine across multiple customers. The idea of a customer learning from all other customers (in the aggregate) for a given machine is quite compelling, and aggregating the data may help overcome the privacy and competitive concerns customers are likely to have.
3. Data pre-processing
This step involves checking all the information to ensure it is accurate and up-to-date, removing irrelevant or outdated information, correcting any errors in the source material, and normalizing text where applicable and appropriate (converting it out of some proprietary format into a format that the AI can understand). Some labeling of the data might be needed to help the model distinguish different contexts, such as error messages versus standard operational instructions. You may want to include a document that gives the model instructions on how to handle irrelevant or inappropriate queries.
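To make this concrete, here's a rough sketch in Python of what normalization and labeling can look like. The error-code pattern and the example lines are made up for illustration; you'd swap in whatever format your own machine's fault messages use.

```python
import re

# Hypothetical fault-code pattern (e.g. "E104: Film feed jam").
# Adjust to match your own machine's message format.
ERROR_CODE = re.compile(r"^E\d{3}\b")

def normalize(text: str) -> str:
    """Strip non-printable characters left over from PDF or
    proprietary-format extraction, and collapse runs of whitespace."""
    text = "".join(ch for ch in text if ch.isprintable() or ch == "\n")
    return re.sub(r"[ \t]+", " ", text).strip()

def label(line: str) -> dict:
    """Attach a simple context label so the model can tell fault
    messages apart from ordinary operating instructions."""
    kind = "error_message" if ERROR_CODE.match(line) else "instruction"
    return {"text": line, "label": kind}

docs = [label(normalize(line)) for line in [
    "E104:   Film feed jam  ",
    "Rotate the sealing jaw to the home position.",
]]
```

In practice your labeling scheme would likely be richer (machine model, document section, revision date), but even this coarse split helps at retrieval time.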
4. Tokenizing and chunking the data and loading it into the model
An important step is to break the content down into tokens, the units that language models actually work with; it turns out you can't just upload a PDF directly to a language model. A token is typically a word or a fragment of a word. The content then needs to be split into chunks, embedded, and indexed so the model can retrieve the relevant passages at query time. Doing that manually is obviously tedious, which is why you would likely want to use one of the many Retrieval-Augmented Generation (RAG) tools or frameworks that handle this for you. Many RAG platforms provide tools for uploading documents directly, handling the tokenization, chunking, and indexing of your content behind the scenes. LangChain is an open-source framework that streamlines the development of applications powered by language models. Microsoft's Azure AI Services provides tools like Azure AI Document Intelligence and Azure AI Search. Multimodal offers chunking tools for RAG applications, and Databricks offers a platform for building RAG applications. Some of these are free; others are paid.
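To demystify what those frameworks do under the hood, here's a minimal sketch of the core chunking idea in plain Python: fixed-size windows with overlap, so a sentence that straddles a boundary still appears whole in at least one chunk. Real tools split on sentence and section boundaries and add embedding and indexing on top; the sizes below are arbitrary placeholders.

```python
def chunk(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split a document into overlapping character windows.
    The overlap keeps context that straddles a chunk boundary
    visible in both neighboring chunks, which improves retrieval."""
    step = size - overlap
    return [text[i:i + size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk would then be run through an embedding model and stored in a vector index, which is exactly the plumbing that LangChain and the Azure tools mentioned above take off your hands.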
5. Testing
This is where you test by asking actual questions that operators or technicians might ask. You'll want to evaluate accuracy and speed, obviously, as well as the model's ability to handle edge cases or uncommon scenarios. You should also throw completely irrelevant or even inappropriate queries at it to see how it responds. You may need to iterate between steps four and five to get the model ready for prime time.
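One way to keep this testing repeatable is a small harness that pairs realistic operator questions with keywords any acceptable answer should contain, and scores accuracy and response time together. A sketch, where `ask` is a stand-in for your actual RAG pipeline's query function and the cases are invented examples:

```python
import time

def ask(question: str) -> str:
    """Placeholder for the real RAG pipeline's query call."""
    return "Check the film feed path and clear the jam, then press reset."

# Each case: (question an operator might ask, keywords a good answer contains)
CASES = [
    ("What does error E104 mean?", ["film", "jam"]),
    ("How do I clear a film feed jam?", ["clear", "reset"]),
]

def evaluate(ask_fn, cases, max_seconds: float = 5.0) -> float:
    """Return the fraction of cases answered correctly within budget."""
    passed = 0
    for question, keywords in cases:
        start = time.perf_counter()
        answer = ask_fn(question).lower()
        elapsed = time.perf_counter() - start
        if all(k in answer for k in keywords) and elapsed <= max_seconds:
            passed += 1
    return passed / len(cases)
```

Re-running the same suite after every change to your chunking or prompts (the step-four/step-five iteration described above) tells you immediately whether a tweak helped or hurt.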
6. Deployment and Integration
This is where you deploy your model on a PC inside your machine and test how it runs on the available hardware. You can also check whether the model's resource demands impact the performance of anything else running on that PC, such as the HMI. You might explore integrating the model with the machine's HMI.
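One defensive pattern worth considering when the model shares a PC with the HMI: run inference on a worker thread with a hard deadline, so a slow answer can never freeze the operator interface. A minimal sketch, assuming a hypothetical `generate` function wrapping your SLM call:

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# One worker thread keeps inference off the HMI's UI thread.
executor = ThreadPoolExecutor(max_workers=1)

def generate(prompt: str) -> str:
    """Stand-in for the actual local SLM inference call."""
    return "stub answer"

def ask_with_deadline(prompt: str, deadline_s: float = 10.0) -> str:
    """Return the model's answer, or a fallback message if inference
    exceeds the deadline, so the HMI always stays responsive."""
    future = executor.submit(generate, prompt)
    try:
        return future.result(timeout=deadline_s)
    except FutureTimeout:
        return "The assistant is busy; please try again."
```

The deadline value itself is something you'd tune during this deployment testing, based on how the model actually performs on your machine's PC.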
7. Continuous improvement feedback loop
You'll want to create a system where the user interactions with the model are logged for subsequent analysis. You can use this data to retrain the model periodically, improving its accuracy and relevance. As mentioned near the beginning of this article, your customer may have their own data that they want to incorporate into the model training.
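The logging piece can start very simply: append every interaction, plus any thumbs-up/thumbs-down feedback from the operator, to a local file for later analysis. A sketch using JSON Lines (the file name and fields are illustrative, not a standard):

```python
import datetime
import json
import pathlib

# Hypothetical local log location; one JSON object per line.
LOG = pathlib.Path("interaction_log.jsonl")

def log_interaction(question: str, answer: str, helpful=None) -> None:
    """Append one question/answer pair, with optional operator feedback,
    for later analysis and periodic retraining."""
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
        "helpful": helpful,  # True / False / None if no feedback given
    }
    with LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Reviewing this log periodically shows you which questions the model fumbles, which is exactly the data you need for the retraining loop described above.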
Admittedly, this is a lot. To do this yourself, your engineers will need to learn Python (not as hard as it might seem, but not trivial), master the art and science of developing a RAG system with fast-changing tools and underlying models, and ensure those models play nice with the existing controls architecture. Most will choose to simply wait for the large industrial automation suppliers to bake generative AI capabilities into their platforms. Others may choose to rely on third-party off-the-shelf tools.
But for a controls engineer with a strong interest in generative AI and the willingness to acquire the skills needed to train and deploy small language models, this could be an avenue worth experimenting with.
OEM Magazine is pleased to publish this semi-occasional column tracking the rapid advances in AI and how packaging and processing machine builders can leverage them to build next-generation equipment. Reach out to Dave at [email protected] and let him know what you think or what you’re working on when it comes to AI.