Maritime Security Challenge - Thales
Institut Mines-Télécom
Maritime security challenge
London, 19th of January
Plan
1. Thales presentation
2. Context
3. Objectives
4. Data sources
5. Application fields & technical environment
6. Selection process and next steps
23/01/2015 Marine Traffic Data Challenge
THALES presentation
Thales Group is a global technology leader for the Defence & Security and the Aerospace & Transport
markets.
The company generates annual revenues of €14bn and employs over 67,000 people in more than
50 countries.
The Center for Information Treatment and Analysis (CENTAI) lab within the Advanced Studies
Department of Thales envisions the future approaches, architectures and technologies for data analysis
and visualization.
The CENTAI team focuses in particular on developing and applying novel machine learning and statistical
techniques in multiple domains (transport, cyber security, social media…) where “big data” is often a
challenge.
Thales is working on the detection of abnormal behavior in marine traffic. The CENTAI lab team has
already developed a prototype, which defines abnormal state patterns based on several fields
(trajectory, stops, times of stops). This challenge proposes to develop a better abnormal behavior
detection system.
Maritime Security Context
In June 2014, the Council of the European Union issued a
European Union maritime security strategy in which it
emphasizes the importance of maritime security for
Europe.
“More than 70% of the external borders of the Union are
maritime and hundreds of millions of passengers pass
through its ports each year. Europe's energy security
largely depends on maritime transport and
infrastructures.”
The main kinds of suspicious activities are:
• Illegal fishing (types of fish, fishing quota, fishing in
protected areas…)
• Organized crime activities: piracy; trafficking of humans,
drugs and counterfeit goods
• Activities linked with nuclear proliferation
• Activities harming the environment: pollution due to
illegal or accidental discharge (fuel, chemical,
biological, nuclear products)
Objectives
Thales aims to address European maritime security needs
by finding new ways of mining marine traffic data, in order
to provide useful maritime security services for national
and European agencies, but also for companies vulnerable
to maritime threats.
By bringing to this challenge a Thales-owned dataset that
mixes private and open marine traffic data, Thales wants to
help data scientists address maritime security
challenges.
The following examples are suggested as study subjects;
beyond these, Thales is open to any new field, idea or way
to improve maritime security.
• Detection of suspicious boat trajectories, in order to
maximize the efficiency of boat controls, which are
currently carried out at random. One sub-challenge
that could be interesting to investigate for this
task is the prediction of boat trip duration, so
that anomalous durations can be detected
• Classification of boat trajectories, in order to
discriminate boat activities as finely as possible
according to their behaviors (if possible, we wish to
go as deep as identifying the types of fish a fishing
boat is chasing)
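As an illustration of the trip-duration sub-challenge, here is a minimal sketch of how anomalous durations might be flagged once a duration prediction is available. Everything in it is an assumption made for illustration and not part of the Thales prototype: the function name `flag_anomalous_durations`, the use of a robust median/MAD score on the prediction residuals, and the threshold of 3.5.

```python
from statistics import median


def flag_anomalous_durations(observed_h, predicted_h, threshold=3.5):
    """Return the indices of trips whose duration residual is a robust outlier.

    observed_h, predicted_h: trip durations in hours, same length.
    The score is a modified z-score built on the median absolute
    deviation (MAD), which is less sensitive to outliers than mean/std.
    """
    if len(observed_h) != len(predicted_h):
        raise ValueError("observed and predicted must have the same length")

    residuals = [o - p for o, p in zip(observed_h, predicted_h)]
    med = median(residuals)
    mad = median(abs(r - med) for r in residuals)
    if mad == 0:
        return []  # no spread in the residuals: nothing can be called an outlier

    flagged = []
    for i, r in enumerate(residuals):
        score = 0.6745 * abs(r - med) / mad  # 0.6745 rescales MAD to ~1 std dev
        if score > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical durations (hours); the fifth trip takes far longer than predicted.
    observed = [10.0, 11.0, 9.5, 10.5, 30.0, 10.2]
    predicted = [10.0] * 6
    print(flag_anomalous_durations(observed, predicted))
```

A residual-based score is only one option; it has the advantage that any duration predictor can be plugged in upstream without changing the flagging step.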
Data Sources (1/2)
Thales datasets
• AIS (Automatic Identification System) data for the South East Asia area (see picture below),
collected over a 6-month period (18 million messages). All boats above a given size are legally
required to have an AIS transmitter onboard. Each AIS transmitter has an id which uniquely
identifies the boat on which it is installed.
─ MMSI (id of the AIS device) – correlation with Lloyds open dataset to find the IMO number
(boat identifier)
─ Timestamp
─ Geolocation: latitude, longitude
─ Local trajectory: speed, heading, rate of turn
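The fields listed above could be represented and grouped into per-boat trajectories roughly as follows. This is a sketch under stated assumptions: the class name `AisMessage`, the helper names, the epoch-seconds timestamp and the units noted in the comments are all hypothetical, since the slides do not describe the actual file format.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class AisMessage:
    mmsi: int            # id of the AIS device
    timestamp: float     # assumed: seconds since the Unix epoch
    lat: float           # degrees
    lon: float           # degrees
    speed: float         # assumed: knots
    heading: float       # assumed: degrees
    rate_of_turn: float  # assumed: degrees per minute


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def build_trajectories(messages):
    """Group messages by MMSI and sort each group by timestamp."""
    tracks = defaultdict(list)
    for msg in messages:
        tracks[msg.mmsi].append(msg)
    for track in tracks.values():
        track.sort(key=lambda m: m.timestamp)
    return dict(tracks)


def track_length_km(track):
    """Sum of great-circle distances between consecutive positions."""
    return sum(
        haversine_km(a.lat, a.lon, b.lat, b.lon)
        for a, b in zip(track, track[1:])
    )
```

Grouping by MMSI and sorting by timestamp is the step on which trajectory features such as stops, trip duration or distance travelled can then be built.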
Open data*
• Equasis
─ EU blacklist
─ Safety control reports
• Lloyds register: vessel database
─ Ship id, type, length, draught
─ Engine type, number of engines
• Greenpeace
─ List of vessels and companies which have been recorded engaging in IUU fishing activities
(Illegal, Unregulated, Unreported)
─ Iranian oil tanker blacklist
* Open data will be aligned with the AIS data in terms of time period.
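The correlation between AIS data and the open datasets (MMSI to IMO number, then IMO number to vessel attributes and blacklists) can be sketched as a simple lookup chain. The function name `enrich_with_register`, the dictionary layouts and the field names `type` and `length_m` are assumptions for illustration; the real Lloyds, Equasis and Greenpeace data will have their own schemas.

```python
def enrich_with_register(mmsi_list, mmsi_to_imo, register, blacklisted_imos):
    """Attach register attributes and a blacklist flag to each MMSI.

    mmsi_to_imo:      dict mapping MMSI -> IMO number (hypothetical layout)
    register:         dict mapping IMO number -> dict of vessel attributes
    blacklisted_imos: set of IMO numbers found on any blacklist
    """
    enriched = []
    for mmsi in mmsi_list:
        imo = mmsi_to_imo.get(mmsi)  # None when no correlation was found
        vessel = register.get(imo, {}) if imo is not None else {}
        enriched.append({
            "mmsi": mmsi,
            "imo": imo,
            "type": vessel.get("type"),
            "length_m": vessel.get("length_m"),
            "blacklisted": imo is not None and imo in blacklisted_imos,
        })
    return enriched


if __name__ == "__main__":
    # All identifiers below are made-up placeholder values.
    mapping = {111: 9000001, 222: 9000002}
    vessels = {
        9000001: {"type": "fishing", "length_m": 35.0},
        9000002: {"type": "tanker", "length_m": 240.0},
    }
    blacklist = {9000002}
    for row in enrich_with_register([111, 222, 333], mapping, vessels, blacklist):
        print(row)
```

Keeping unmatched MMSIs in the output (with `imo` set to `None`) rather than dropping them preserves boats that cannot be correlated with the register, which may itself be a useful signal.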
Data Sources (2/2)
Weather data
• NOAA (National Oceanic and Atmospheric Administration): worldwide grid data, with a finer grid
around the US (15 km × 15 km vs 50 km × 50 km). These data are weather observations
(not forecasts). Observations are updated every 3 h and come from buoys and satellites.
─ Wind (direction, strength, gust, isDirectionVariable)
─ Current (direction, strength)
─ Temperature (surface and dew point)
─ Waves (height, speed, direction).
• OpenWeatherMap
─ API to collect NOAA observation data, plus forecasts from the Canadian weather
model (forecasts of all the observed variables at T+3h, T+6h and T+12h; wind predictions are
poor beyond 3h)
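To enrich AIS positions with gridded observations updated every 3 h, each position needs a join key made of a time slot and a grid cell. The sketch below shows one way to compute such a key. It rests on assumptions not stated in the slides: observation windows are taken to start at multiples of 3 h from the Unix epoch, and a regular latitude/longitude grid with a hypothetical cell size of 0.5° (roughly 55 km at the equator) stands in for the 50 km grid.

```python
from math import floor

THREE_HOURS_S = 3 * 3600


def observation_slot(timestamp_s):
    """Start (epoch seconds) of the 3-hour window containing the timestamp."""
    return int(timestamp_s // THREE_HOURS_S) * THREE_HOURS_S


def grid_cell(lat, lon, cell_deg=0.5):
    """Index of the grid cell containing the position (assumed regular lat/lon grid)."""
    return (floor(lat / cell_deg), floor(lon / cell_deg))


def weather_key(timestamp_s, lat, lon, cell_deg=0.5):
    """Join key (slot, cell_row, cell_col) shared by AIS and weather records."""
    return (observation_slot(timestamp_s),) + grid_cell(lat, lon, cell_deg)
```

Computing the same key on both the AIS messages and the weather records turns the enrichment into an ordinary equality join, which is straightforward to distribute.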
Additional datasets
• AIS data can be collected daily from public websites such as MarineTraffic.com with dedicated
web crawlers (1 million messages per day with worldwide coverage). IMT has already built a
proof of concept of such a crawler. Such data could be used to enlarge the available dataset.
• Other interesting open datasets: wet market data (mostly available for the US; availability to be
checked for South East Asia)
• Other datasets from public bodies or private companies are welcome
Global sizing: 6 months of data, between 20 and 25 GB in total
Application fields & technical environment
Primary application fields:
• Data management and data enrichment (time and space series, geographical information systems)
• Detection algorithms and distributed machine learning algorithms (such as MLlib/Sparkling Water)
Secondary application fields:
• Visualization/presentation of suitable business-oriented results (JavaScript libraries, Mapbox)
• Performance optimization and scalability
Technical environment
• OS: CentOS, possibly other Linux distributions
• Hadoop distribution: Hortonworks, possibly Cloudera
• Clusters: Spark, Storm, Elasticsearch…
• Workspace: dedicated and secure workspace provided (https/ssh)
[Figure: Data management; Algorithms & machine learning; Data visualization; Performance optimization]
Selection process and next steps
Each team must present a detailed methodology, with references and research papers enclosed, for the
detection of abnormal behavior in marine traffic.
Particular attention will be given to a sound understanding of the problem, algorithmic and
technical mastery, and the expected output.
After the selection of candidates, a monthly project meeting will be held with the Thales and EIT ICT Labs
teams. Candidates can apply until March 2015.