
Computer-Aided Technologies for Food Risk Assessment

Candidate: Francesco Rossi
Tutor: Prof. Alfredo Benso

XXX Cycle PhD in Computer and Control Eng. at Politecnico di Torino

Motivation of PhD in Computer Science

Master’s Degree: Biomedical Engineering
Virtual Colonoscopy
Prostate Cancer Research

#machinelearning #computervision #computeraided

                   Unintentional    Intentional
Economical Gain    FOOD QUALITY     FOOD FRAUD
Health Harm        FOOD SAFETY      FOOD DEFENSE

Outline

Dairy Product Traceability
Species Substitution

Heuristic Molecular Dairy Farming Analysis “From Farm to Fork”

Food Traceability

Qualitative and Quantitative certification

STR-DNA Pool Analysis

Get around the pool problem

Data
• 2 farms for 12 months

1. Sample Collection (DNA)
2. STRs selection (20)
3. Genotyping Process (STR)
4. Data extraction (RFU)

Cow → 2 alleles
Pool → vector of alleles

Assumption: $\mathrm{Pool} \approx \sum_{i=0}^{N} \mathrm{Cow}_i$
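The assumption that a pool profile is approximately the sum of the individual cow profiles can be illustrated with a minimal sketch. It assumes each cow is encoded as a count vector over a hypothetical list of allele labels for one STR locus; the names `ALLELES`, `genotype_vector` and `pool_vector` are illustrative and do not come from the original work, which operates on RFU signals.

```python
from typing import List, Sequence

# Hypothetical allele labels for a single STR locus (illustrative only).
ALLELES = ["A1", "A2", "A3", "A4"]

def genotype_vector(cow_alleles: Sequence[str]) -> List[int]:
    """Encode one cow (2 alleles at the locus) as a count vector."""
    if len(cow_alleles) != 2:
        raise ValueError("a cow carries exactly 2 alleles per locus")
    return [sum(1 for a in cow_alleles if a == label) for label in ALLELES]

def pool_vector(cows: Sequence[Sequence[str]]) -> List[int]:
    """Idealised pool profile: element-wise sum of the cows' vectors."""
    vectors = [genotype_vector(c) for c in cows]
    return [sum(column) for column in zip(*vectors)]

cows = [("A1", "A2"), ("A2", "A2"), ("A1", "A4")]
pool = pool_vector(cows)
print(pool)  # [2, 3, 0, 1]
```

In this idealised setting the pool vector always sums to twice the number of cows; a measured pool deviates from it because cows contribute unequal amounts of material.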

On-site contamination & some possible loss
Some loss during ripening
[Figure: grid of outcome labels: Correct, Medium, High, Forgery]

HEURISTIC
[Figure: pipeline diagram with labels DATA, SIMULATION, COWs, POOL, CMA-ES, W, Predicted POOL, P1 & P2, SSE, SCORE]

DATA
Every dairy product (i.e. 46 pools) has been simulated 24 times
Heuristic Analysis: 0-50-100 % forgery

CMA-ES
Covariance Matrix Adaptation Evolution Strategy

W
$\mathrm{Predicted\ Pool} \approx \sum_{i=0}^{N} w_i \cdot \mathrm{Cow}_i$

lower boundary = $0.5 / m$
upper boundary = $\max(3 / m,\ 1)$
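The weighted-sum model and the weight boundaries can be sketched as follows. This is not CMA-ES: to stay within the standard library, a bounded random local search is swapped in as a stand-in optimizer, and `m` is assumed to be the number of cows, which the slides do not state. All function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

def weight_bounds(m: int) -> Tuple[float, float]:
    """Bounds as read from the slide: 0.5/m and max(3/m, 1)."""
    return 0.5 / m, max(3.0 / m, 1.0)

def predict_pool(weights: Sequence[float],
                 cows: Sequence[Sequence[float]]) -> List[float]:
    """Predicted pool = sum_i w_i * Cow_i, computed allele by allele."""
    n_alleles = len(cows[0])
    return [sum(w * cow[k] for w, cow in zip(weights, cows))
            for k in range(n_alleles)]

def sse(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared errors between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_weights(pool, cows, iters=3000, step=0.05, seed=0):
    """Bounded random local search (a stand-in, NOT CMA-ES itself)."""
    rng = random.Random(seed)
    lo, hi = weight_bounds(len(cows))  # assumption: m = number of cows
    best = [lo] * len(cows)
    best_err = sse(pool, predict_pool(best, cows))
    for _ in range(iters):
        # Perturb every weight, then clip it back into [lo, hi].
        cand = [min(hi, max(lo, w + rng.gauss(0.0, step))) for w in best]
        err = sse(pool, predict_pool(cand, cows))
        if err < best_err:
            best, best_err = cand, err
    return best, best_err

cows = [[1, 1, 0, 0], [0, 2, 0, 0], [1, 0, 0, 1]]
pool = predict_pool([0.6, 0.3, 0.9], cows)  # synthetic pool, known weights
weights, err = fit_weights(pool, cows)
lo, hi = weight_bounds(len(cows))
```

A real CMA-ES implementation additionally adapts the covariance of the sampling distribution, which this stand-in does not attempt.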

Predicted POOL
[Figure: bar charts of POOL vs. Predicted POOL for a CORRECT POOL and an ALTERED POOL]

P1 - forgery
Rate of alleles that are included in the pool’s profile but not in the cows’

P2 - loss by ripening
Percentage of alleles within the cows’ profile but not detected in the pool

SSE
Sum of Squared Errors between original and predicted pool

SCORE = P1 · P2 · SSE
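The score SCORE = P1 · P2 · SSE can be worked through on a toy example. The sketch below assumes that P1 is normalised by the number of alleles in the pool's profile and P2 by the number of alleles in the cows' profile; the slides do not give the denominators, so both are assumptions, as are the function names. Both allele sets are assumed to be non-empty.

```python
from typing import Sequence, Set

def p1_forgery(pool_alleles: Set[str], cow_alleles: Set[str]) -> float:
    """Share of the pool's alleles that no cow carries (assumed denominator)."""
    return len(pool_alleles - cow_alleles) / len(pool_alleles)

def p2_loss(pool_alleles: Set[str], cow_alleles: Set[str]) -> float:
    """Share of the cows' alleles missing from the pool (assumed denominator)."""
    return len(cow_alleles - pool_alleles) / len(cow_alleles)

def sse(original: Sequence[float], predicted: Sequence[float]) -> float:
    """Sum of squared errors between original and predicted pool."""
    return sum((o - p) ** 2 for o, p in zip(original, predicted))

def score(p1: float, p2: float, sse_value: float) -> float:
    """SCORE = P1 * P2 * SSE, as on the slide."""
    return p1 * p2 * sse_value

pool_alleles = {"A1", "A2", "A5"}
cow_alleles = {"A1", "A2", "A3", "A4"}
p1 = p1_forgery(pool_alleles, cow_alleles)       # 1 of 3 pool alleles
p2 = p2_loss(pool_alleles, cow_alleles)          # 2 of 4 cow alleles
err = sse([10.0, 20.0, 0.0], [8.0, 21.0, 2.0])   # 4 + 1 + 4
total = score(p1, p2, err)
print(round(total, 3))  # 1.5
```

Because the three terms are multiplied, the score is zero whenever any one of them is zero, for example when every pool allele is explained by the cows.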

Results

Results grouped by pool’s type

Final Results & Conclusion

Vision and Learning in Fish Species Identification
“OK il Pesce è Giusto”

Fish Species Substitution

Introduction

VISUAL FEATURES
FEATURE ENGINEERING
CLASSIFICATION MODEL

Feature engineering: turn your inputs into things the algorithm can understand

DATA: Raw Image
DATA: Segmentation
DATA: 12 Key-Points
DATA → INFORMATION: 30 features
INFORMATION → KNOWLEDGE: «Is this really the species declared on the label or not?»

Dataset

COMMON NAME           LATIN NAME                N°
European Anchovy      Engraulis encrasicolus    125
European Pilchard     Sardina pilchardus        107
[missing]             Pagellus erythrinus       20
Atlantic Mackerel     Scomber scombrus          18
Gilt-Head Bream       Sparus aurata             22
European Hake         Merluccius merluccius     19
Striped Red Mullet    Mullus surmuletus         28
Total                                           339

Classification Model

ANNs: One-Class & Multi-Class Classifiers

Results
Leave-one-out cross-validation and in-field validation

OCC (One-Class Classifier): 100% Acc.
MCC (Multi-Class Classifier): 100% Acc. (only with sardine and anchovy)
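Leave-one-out cross-validation holds out each sample in turn, trains on the rest and tests on the held-out sample. The sketch below shows the procedure on toy data; a 1-nearest-neighbour rule is swapped in for the ANNs used in the work, and the two-dimensional feature vectors and function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence

def nearest_neighbour_predict(train_X: Sequence[Sequence[float]],
                              train_y: Sequence[str],
                              x: Sequence[float]) -> str:
    """Stand-in classifier: label of the closest training sample."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[best]

def leave_one_out_accuracy(X: List[List[float]], y: List[str],
                           predict: Callable) -> float:
    """Hold out each sample once, fit on the rest, count correct predictions."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Toy, well-separated feature vectors (illustrative values only).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
     [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
y = ["anchovy"] * 3 + ["sardine"] * 3
accuracy = leave_one_out_accuracy(X, y, nearest_neighbour_predict)
print(accuracy)  # 1.0
```

With small datasets such as the one above, leave-one-out uses almost all samples for training in every fold, at the cost of one model fit per sample.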

[Figure: species grouped under the MCC and OCC labels: Engraulis encrasicolus, Sardina pilchardus, Pagellus erythrinus, Scomber scombrus, Sparus aurata, Merluccius merluccius, Mullus surmuletus]

RAI1 - Live Demo

Limitations

• A new order/family/genus/species of fish may require new features

• Segmentation fails if the background is not uniform, and key-point detection and feature extraction fail with it

• The two-step key-point detection should be improved

• Key-point interaction should be avoided

[Figure: FEATURES, CLASSIFIER, OUTPUT pipeline, with the first two stages replaced by DEEP LEARNING]

F.I.S.HUB
Fish Identification Software Hub

[Figure: F.I.S.HUB components: Database, Classifier, Mobile App, Validation, Fillets Recognition]

Database

GUIDELINES

25 species, > 7k pictures, UK & Italy

Database

ORDER                FAMILY            GENUS           SPECIES
Clupeiformes         Clupeidae         Clupea          harengus
Clupeiformes         Clupeidae         Sardina         pilchardus
Clupeiformes         Clupeidae         Sprattus        sprattus
Clupeiformes         Engraulidae       Engraulis       encrasicolus
Gadiformes           Gadidae           Gadus           morhua
Gadiformes           Gadidae           Melanogrammus   aeglefinus
Gadiformes           Gadidae           Merlangius      merlangus
Gadiformes           Gadidae           Pollachius      virens
Gadiformes           Merluccidae       Merluccius      merluccius
Perciformes          Sparidae          Dentex          dentex
Perciformes          Sparidae          Dentex          gibbosus
Perciformes          Sparidae          Diplodus        annularis
Perciformes          Sparidae          Pagellus        acarne
Perciformes          Sparidae          Pagellus        bogaraveo
Perciformes          Sparidae          Pagellus        erythrinus
Perciformes          Sparidae          Pagrus          caeruleostictus
Perciformes          Sparidae          Pagrus          pagrus
Pleuronectiformes    Pleuronectidae    Hippoglossus    hippoglossus
Pleuronectiformes    Pleuronectidae    Limanda         limanda
Pleuronectiformes    Pleuronectidae    Microstomus     kitt
Pleuronectiformes    Pleuronectidae    Pleuronectes    platessa
Pleuronectiformes    Pleuronectidae    Reinhardtius    hippoglossoides
Pleuronectiformes    Scophthalmidae    Psetta          maxima
Pleuronectiformes    Scophthalmidae    Scophthalmus    rhombus
Pleuronectiformes    Soleidae          Solea           vulgaris

Annotations on the table: NORMAL FISH / FLATFISH (Pleuronectiformes); MANY / FEW

Classifier

Examples good (m is the metric)

Melanogrammus aeglefinus, m = 0.052

Engraulis encrasicolus, m = 0.023

Psetta maxima, m = 0.015

Examples wrong (m is the metric)

Sardina pilchardus / Sprattus sprattus, m = 1.129

Hippoglossus hippoglossus / Microstomus kitt, m = 1.612

Merlangius merlangus / Pollachius virens, m = 1.741

Classifier - Results

t-SNE plot for species clustering
Features embedding in 10 dimensions

KNN

Mobile Application & Live Demo Record

Validation

FISHUB Database
• Global Acc. 94%
• e.g. Dentex dentex Acc. 86%, Diplodus annularis Acc. 96%, Sardina pilchardus Acc. 96%
• The per-species accuracy is higher when the number of pictures per species is high, and vice versa

IN FIELD with DNA analysis
• 69 fish sampled with pictures and DNA barcode
• 9 frauds found by DNA
• 5 frauds discovered by the App
• 62 correct identifications by the App

Fillets Recognition
EXPLORATORY TASK

Legend
• Normal image
• Microscope
• Toluidine blue stain
• Molecular sensor

[Figure: FILLET + acquisition method = RESULTS]

NIR - SCiO

Near-infrared (NIR) spectroscopy is a method that uses the electromagnetic spectrum from about 700 nm to 2500 nm. It can penetrate tissue and is useful for probing bulk material with essentially no sample preparation.

[Figure: workflow labels: Acquisition, Visualization, Data, Model]

For the proof of concept (POC) we selected 10 fish fillets for each species considered: Solea solea, Pleuronectes platessa, Pangasianodon hypophthalmus.

Global Accuracy

METHOD    Details                      %
1         normal img                   59.8
2         microscope img               49.5
3         normal img + staining        56.6
4         microscope img + staining    53.2
5         SCiO NIR                     98.3
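The image-based methods 1 to 4 rely on hand-crafted descriptors, one of which is the Gray Level Co-Occurrence Matrix (GLCM). As a minimal sketch of what that descriptor computes, the code below builds a normalised, non-symmetric GLCM for one pixel offset on a tiny 4-level image and derives the contrast texture feature. The function names and the toy image are illustrative; the original pipeline is not shown in the slides.

```python
from typing import List, Sequence

def glcm(image: Sequence[Sequence[int]], levels: int,
         dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Normalised co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def contrast(p: Sequence[Sequence[float]]) -> float:
    """Contrast feature: sum over (i, j) of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

# Tiny 4x4 image with gray levels 0..3 (illustrative values only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)      # horizontal neighbours, 12 pixel pairs
texture_contrast = contrast(p)  # 7/12 for this image
```

Features such as contrast, computed for several offsets, form the vector that is passed to a classifier like an SVM.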

For methods 1-2-3-4 the pictures were collected and processed with image feature extraction techniques such as Scale-Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF) and Gray Level Co-Occurrence Matrix (GLCM). Finally, an SVM classifier was used.

Latest News

• Gadus morhua
• Pleuronectes platessa
• Pollachius virens
• Epinephelus costae
• Synaptura cadenati
• Sebastes norvegicus
• Merluccius merluccius
• Scomber scombrus

3 random scans per fillet
40 fillets per species

RESULTS
98% Accuracy

Conclusion

[Figure: diagram with labels Bio*, Data, Computer Science, Model, Interdisciplinarity, PhD]

List of Publications

1 Journal Paper *
3 Conference Proceedings *
3 Abstracts

* first author

N°    Name       15 STR    10 STR
1     AGLA29     ok        ok
2     MB025      ok        ok
3     Z27077     ok        ok
4     BMS2142
5     MB071      ok        ok
6     SRC276
7     MB064
8     BM1706     ok        ok
9     HUJ625     ok
10    BMC1207    ok
11    AGLA232    ok
12    BM3507
13    BM4602
14    BMC6020    ok        ok
15    INRA133    ok        ok
16    BMS0607    ok        ok
17    BMC4214    ok        ok
18    BM720      ok
19    BMS4044    ok        ok
20    BM0143     ok

          GTX 580    DGX-2
Power     600 W      10 kW
Memory    1.5 GB     512 GB
Price     400 €      420 k€

Processed: Assumes the Beer-Lambert model is valid, and transforms the measured signal to be linear with concentration by applying a log transform and adjusting the result for noise and deviations from the model.

Normalized: Performs normalization of the signal. This is meant to compensate for changing measurement conditions (e.g. varied scanning distances) that typically occur from sample to sample. The Y axis still represents reflectance, but in normalized units instead of raw reflectance.

Processed and Normalized: First assumes the Beer-Lambert model (Processed) and then normalizes the results to compensate for differences in the optical path between samples. This is useful, for example, when there is variation in the thickness of the samples.

(log(R))'' and Normalized: Similar to Processed and Normalized, but uses a more aggressive form of Processed. Adds more noise, but in some cases may be the only way to create a good model.
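The preprocessing options above can be illustrated with a small sketch. The exact SCiO transforms are not given in the slides, so everything here is an assumption: "Processed" is approximated by the log transform A = -log10(R), "Normalized" by a standard normal variate (subtract the mean, divide by the standard deviation), and the second derivative by a central finite difference. The function names are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence

def processed(reflectance: Sequence[float]) -> List[float]:
    """Log transform, A = -log10(R): linear in concentration under Beer-Lambert."""
    return [-math.log10(r) for r in reflectance]

def normalized(signal: Sequence[float]) -> List[float]:
    """Standard normal variate (an assumed stand-in for the device's normalization)."""
    mu, sigma = mean(signal), pstdev(signal)
    return [(s - mu) / sigma for s in signal]

def second_derivative(signal: Sequence[float]) -> List[float]:
    """Central finite difference; amplifies sharp features and noise alike."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]

# Toy spectra: the "thick" sample has twice the optical path of the "thin" one,
# so under Beer-Lambert its reflectance is the square of the thin one's.
thin = [0.8, 0.5, 0.4, 0.6]
thick = [r ** 2 for r in thin]

thin_pn = normalized(processed(thin))
thick_pn = normalized(processed(thick))
max_gap = max(abs(a - b) for a, b in zip(thin_pn, thick_pn))
```

In this toy case the two "Processed and Normalized" spectra coincide up to rounding, which is the thickness compensation described above; the second derivative of a noisy signal shows why the last option "adds more noise".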