neatComponents is the hybrid-cloud database engine that powers clearString.
Train a Model
Once a Model has been created, it can be Trained. In essence, you present the model with a set of images and, for each image, identify which category it belongs to. Each image can be in only one category.

Training images

Number of images to use for training

There is no set minimum or maximum number of images that should be used for training. The optimum number depends on the nature of the images and on how distinct the categories are from one another. As a rough guide, somewhere between 10 and 50 images for each category will likely give good results.

Number of categories

There is no limit to the number of categories. The minimum is two, but there could be hundreds or even thousands of different categories.

Note that as far as the ML model is concerned, a category is identified simply by its name, supplied as a text string, and when the model classifies an image it returns the predicted category as a text string. This string can then be matched against the name field of a clearString table that holds a record for each category, for further processing. The ML system is not concerned with this aspect and simply works with text strings for the categorisation.

Testing images

In addition to accepting a set of categorised images for training, the action also accepts a separate set for testing. While this is optional, it is highly recommended, because it gives numerical feedback on how effectively the model can be expected to identify images within each category. Typically the testing set is smaller than the training set (so if, say, you are training with 50 images per category, the testing set would be an additional 5 images per category). For the testing metrics to be valid, the training and testing images must be different from each other, i.e. the model must not have been trained on any of the images that are presented for testing.
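The requirement that training and testing images do not overlap can be checked before the data ever reaches the training action. The following is a minimal Python sketch of one way to prepare the two sets outside clearString. It is not part of the product: the function name `split_by_category`, the `(image_id, category)` record shape, and the file names are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def split_by_category(records, test_per_category=5, seed=0):
    """Split (image_id, category) records into disjoint training and testing sets."""
    by_category = defaultdict(list)
    for image_id, category in records:
        by_category[category].append((image_id, category))

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    training, testing = [], []
    for category in sorted(by_category):
        items = by_category[category]
        rng.shuffle(items)
        # Hold back a few shuffled images for testing; train on the rest.
        testing.extend(items[:test_per_category])
        training.extend(items[test_per_category:])
    return training, testing

# Hypothetical data: 55 images in each of two categories.
records = [(f"{name}_{i:03d}.jpg", name) for name in ("cat", "dog") for i in range(55)]
training, testing = split_by_category(records)

# No image may appear in both sets, otherwise the testing metrics are not valid.
overlap = {img for img, _ in training} & {img for img, _ in testing}
print(len(training), len(testing), len(overlap))  # 100 10 0
```

Splitting per category, rather than across the whole dataset, keeps every category represented in the testing set, which matches the "5 images per category" guideline above.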
To Train a model

Use the ML - Train Model event action.

This action must run in background execution. The time it takes to run is approximately proportional to the number of training images presented, and depends on the speed of the underlying server hardware. It could therefore take several seconds or even minutes to process. Consider building in logic that uses the Diagnostics fields to detect when training has completed, so that users do not attempt to use the model before it is ready.

On the Configuration tab:

Input

Model to Train
The ModelID of the Model that will be trained. Use the ML - Create Model action to obtain this ModelID and store it in a table field for future reference.

Training Data
A query that contains a set of records. The sequence of the records is not important. The records should contain two fields:
Testing Data
A query that contains a set of records. The sequence of the records is not important. The records should contain two fields:
Diagnostics

This section gives the standard event action diagnostic feedback.

Completed OK
Store in a Checkbox field.

Feedback
Store in a Text field.

Testing Summary
Store in a Large Text field. This provides detailed feedback on the training, including an assessment of how well the model has been trained, based on how well it recognised the testing images.
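Because training runs in the background, any external process that depends on the model needs to wait until the Completed OK field has been set. The sketch below shows the general polling pattern in Python. It is illustrative only: `wait_for_training` and the `read_completed_ok` callable are hypothetical, and how the Checkbox field is actually read (and whether it stays unticked while training is running) is an assumption that depends on your own setup.

```python
import time

def wait_for_training(read_completed_ok, timeout_s=600.0, poll_interval_s=2.0):
    """Poll until read_completed_ok() returns True, or give up after timeout_s.

    read_completed_ok is a hypothetical callable that returns the current
    value of the field where the action stores its Completed OK result.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_completed_ok():
            return True
        time.sleep(poll_interval_s)
    return False  # still not completed: treat the model as not ready

# Simulated field that becomes ticked on the third check.
checks = iter([False, False, True])
ready = wait_for_training(lambda: next(checks), timeout_s=5.0, poll_interval_s=0.01)
print(ready)  # True
```

The timeout matters because a training run that fails may never tick Completed OK; in that case the Feedback field is the place to look for the reason.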
Retraining

It is quite usual to improve a model iteratively by adding images and retraining. To do this, add more images to the dataset in your tables, and then re-run the training action. The AI model forgets everything from its earlier training and retrains from the images presented to it. This means that you must present the complete set of images each time you retrain; you cannot provide only the additions.
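Since each retraining starts from scratch, the data presented must be the earlier images plus the additions. As a rough illustration in Python (the function `build_retraining_set` and the `(image_id, category)` record shape are assumptions, not clearString features), the full set can be assembled like this, while also enforcing the rule that each image belongs to a single category:

```python
def build_retraining_set(existing, additions):
    """Merge earlier and new (image_id, category) records into one full set."""
    merged = dict(existing)
    for image_id, category in additions:
        # Each image can be in only one category, so reject conflicting labels.
        if image_id in merged and merged[image_id] != category:
            raise ValueError(f"{image_id} is already labelled {merged[image_id]!r}")
        merged[image_id] = category
    return sorted(merged.items())

# Hypothetical records from the first training run, plus one new image.
existing = [("a.jpg", "cat"), ("b.jpg", "dog")]
additions = [("c.jpg", "cat")]

full_set = build_retraining_set(existing, additions)
print(full_set)  # [('a.jpg', 'cat'), ('b.jpg', 'dog'), ('c.jpg', 'cat')]
```

In clearString itself the equivalent is simply to keep all images in the same tables, so that the Training Data query returns both old and new records.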