“Big Data” and What it Will Mean for Maritime Training: “The Art of the Possible”
Maybe you’ve heard the term “Big Data” before. Maybe you haven’t. Either way, you’ll be hearing more and more about big data soon enough. It is already providing insights into all manner of human interaction - including how people learn and how we train them. Its potential is unparalleled in history. My belief is that big data could easily be the single largest driver of training improvement since learning began.
My favourite quote about the potential of big data is “At IBM, big data is about the art of the possible” (from Four Vendor Views on Big Data and Big Data Analytics: IBM). Deriving insights from big data is akin to an art - and like most art, there are few limits or boundaries.
So what is big data, why am I talking about it here, and how does it have the potential to vastly improve maritime training?
The term “Big Data” provides the first insight into what it is - but not much more than that. It is data, and there is a lot of it. But big data is not only about the data itself. It is also about how computers have advanced in speed and capacity to the point where they can not only generate this data, but also sift through it to derive conclusions hidden deep within it. This is a form of data mining - a term with which you may be familiar. So big data is about collecting a tremendous amount of often fine-grained information, and then telling computers to sift through that mountain of information to find the valuable hidden nuggets of knowledge we are looking for and which would otherwise be invisible to us.
The thing about big data is that it can refer to almost any kind of data. We (as in “humanity”) collect a LOT of data every day, and the daily amount goes up day by day with no end in sight. By some estimates, of all of the data that exists now in the world, 90% has been created in just the last two years. Even more interesting is that our newly acquired understanding of the potential for big data is going to cause us to collect more and more of it. We are just at the very beginning of an exploding phenomenon.
So what kind of data are we collecting (or could we collect)? Well - just about everything. Weather data. Video recordings. E-mail interactions. Social network interactions. Web browsing patterns. Purchasing habits. And most interesting to people involved in maritime training - learning experiences, interactions and outcomes. Anything from trainee grades, to how they navigate through learning materials, to how they perform on simulators. There is hardly a human endeavour in existence now that does not leave (or does not have the potential to leave) a rapidly expanding trail of data behind it. Technology is the great enabler in big data.
This tremendous breadth is both frightening and exciting. You would think that someone with a strong understanding of technology such as myself would not be taken aback by the kinds of data that are collected daily. But recently I purchased a Google Nexus 7 tablet (great tablet, by the way). When I took it out of the box it asked me for my Google password - it already knew my e-mail address. Once it had my password it then proceeded to populate itself with more or less my entire life. It knew where I lived, where I worked, the websites I frequented and much more. An astonishing amount of data collected from several years of carefully observing me as a Google user. Even I, who should know better, was impressed - and not just a little concerned.
But putting aside the issues of data ownership and privacy for the moment, I was also immediately reminded of the incredible potential. Big data is a source of information. A big source. And information is at the heart of knowledge. Big knowledge. As big data sources grow, the task for researchers is to mine that data in an attempt to understand more about humans, what they do, and how they do it. This is the “art” part - the “Art of the Possible” as IBM says.
As I alluded to above, the thing about big data is that we are just at the beginning of it. Now that we are beginning to understand its potential, people are on the constant lookout for sources of big data that we are not yet collecting. Training interactions are a great example.
One of my greatest regrets professionally (and I am not complaining - I actually have very few) is the opportunity I missed with the first company I started as a faculty member at UBC. The company was called WebCT. WebCT built the first widely-used learning management system for universities and colleges. It was very popular. In fact, we had roughly 14 million students using WebCT in 80 countries on a daily basis. That’s a lot of learners doing a lot of learning. The opportunity that I now realize I missed was that of anonymously collecting data on how those students learned. Because it might be hard to know exactly the kind of data to collect, the goal would be to collect it all and then make sense of it later when we knew better what questions to ask. So - if I had known then what I know now, I would have collected everything - from “macro” data such as assignment, exam and course grades, to “micro” data such as how long students spend on the learning materials, how long they spend answering a test question, how they navigate through the learning content, etc. Now - the truth is that we did collect a lot of this data - but it was only on a class-by-class basis for the use of the instructor. It was never centralized, and therefore no large-scale analysis was ever done.
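To make the “macro” versus “micro” distinction concrete, here is a minimal sketch of what one centralized, anonymous record of a learning interaction might look like. This is purely illustrative and is not how WebCT stored anything: the `LearningEvent` class, its field names, and the helper functions are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class LearningEvent:
    """One hypothetical record; every field name here is an assumption."""
    learner_key: str               # anonymous key, never a name or e-mail
    course: str
    kind: str                      # e.g. "page_view", "question", "exam"
    item: str                      # which page, question, or exam
    seconds: float                 # "micro" data: time spent on the item
    value: Optional[float] = None  # "macro" data: a grade or score, if any


def to_json_line(event: LearningEvent) -> str:
    """Serialize one event as a line suitable for a central append-only log."""
    return json.dumps(asdict(event), sort_keys=True)


def total_seconds_by_learner(events: Iterable[LearningEvent]) -> Dict[str, float]:
    """Sum the time each anonymous learner spent across all recorded events."""
    totals: Dict[str, float] = {}
    for event in events:
        totals[event.learner_key] = totals.get(event.learner_key, 0.0) + event.seconds
    return totals
```

The point of the sketch is only that every interaction, large or small, lands in one common format in one place, so that questions nobody thought of at collection time can still be asked later.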
Think of the potential had things been different. If only I had been a little bit smarter, years ago we would have had a very deep and very wide pool of data from which we could derive insights - especially now that computer CPUs have become sufficiently fast and memory sufficiently large to be able to process these large data sets. So what questions might we ask of this data? The most important thing to realize is that it is not necessary to know all the questions now. Some will occur to us now, and we will think of many others later (some of them informed by the answers to our early questions). But just to provide some examples - think about the following:
- We could examine the data in order to try and correlate training outcomes with habits of the individual learner. For example, we would probably not be surprised to find that in general, trainees who spend more time training do better. Or do they? Perhaps we would find other completely unexpected trainee patterns or attributes that would have never occurred to us, but that are good predictors of training outcomes and could therefore be very valuable in helping guide the next batch of trainees?
- We could look at learning patterns for some particular knowledge to see if there were any common paths from “not having” to “having” that knowledge. We might find that there are a few well defined learning paths through the content, and that trainees of a particular background who followed one path did much better than those who followed one of the other paths. This could inform how we train the next person who shares that background.
- We could take this further and reexamine knowledge or competencies months or years after the learning occurred - and then correlate them back to how the trainee acquired that knowledge in the first place. This may yield insights into how to improve knowledge retention.
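As a small illustration of the first question above, the sketch below measures how strongly time spent training tracks assessment scores, using the Pearson correlation coefficient. The record layout and the numbers are invented for the example, and a real analysis would need far more data and care (a correlation alone does not tell us that more training time causes better results).

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical trainee records: hours of training and a final assessment score.
records = [
    {"hours": 4.0, "score": 61.0},
    {"hours": 9.0, "score": 70.0},
    {"hours": 12.0, "score": 83.0},
    {"hours": 15.0, "score": 79.0},
]

r = pearson([rec["hours"] for rec in records], [rec["score"] for rec in records])
print(f"correlation between hours and score: {r:.2f}")
```

A value near 1 would suggest that trainees who spend more time tend to score higher, a value near 0 that time alone explains little, which is exactly the kind of "Or do they?" result worth checking rather than assuming.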
The bottom line is that big data, if we collect it and analyse it correctly, has the potential to let us peer deeply into the (until now) invisible universe of exactly what is successful and what is not in training. No more having to guess. No more relying only on opinion, comparatively small research studies or our own experience. These are all valuable, but so much more is starting to become possible.
There are many potential sources of big data in maritime training. The ones that come to mind now are simulation training and assessment, online learning, competency management systems, and the recording of vessel operations. Each of these has the opportunity to generate and save observations about seafarers and their performance. More will emerge over time. As before, put aside for the moment thoughts about privacy issues and data ownership (which must not be ignored), and consider the potential.
Simulation training presents a tremendous opportunity to collect reams of data (both macro and micro) for analysis. How can simulators best be used to improve navigation performance? Big data could let us collect information from every hour spent on simulators around the world and analyse how each particular form of use relates to performance gains. How do these performance gains relate to existing knowledge, experience, or learning taking place in other venues such as via eLearning, or to those recorded in competency management systems? What are the most common mistakes, and how are they correlated to either preceding or following events? Are these performance gains increased with experience, are they made worse with time since the last related training event, and how can this information be used to improve training outcomes? Big data can give us many of these insights.
Similar observations can be recorded, and similar questions asked of those observations, when it comes to eLearning and actual vessel navigation. Additionally in the case of vessel navigation, we could add real time recordings of weather, equipment, noise, conversations, and so on to determine how each of these impacts performance and safety. And as above, all of this can be correlated to other training experiences, competencies, assessments, knowledge, and seatime. Imagine how much we could learn about what makes for safe operations, giving us the ability to target changes in training and operations where they will make the largest impact. It is as though we would be shining a light on knowledge which we otherwise have to mostly guess at. We know there are some tremendously accomplished seafarers, and intuition gives us an idea of what makes them so. Big data has the potential to provide concrete data on what makes them so, and may give us the tools necessary to make so many more like them.
As I say, we have only begun to consider the kinds of questions we can ask, and the depth of results that we will derive. We truly are at the beginning of this, but the early potential is starting to come into focus. And now that the spotlight has been shone on big data, its use and the insights derived from it will accelerate rapidly.
Although this article is about the potential of big data for maritime training, no discussion on big data produced by or collected about humans would be complete without a discussion of the privacy and data ownership implications. These are not small issues, but they are not insurmountable. I will leave this topic largely for another discussion (though I am happy to write about it if people are interested), but there are precedents for this.
For example, some systems carefully anonymize the data to ensure that it could never be identified as having come from a particular person. It actually turns out that it is difficult to fully anonymize data, so while this is possible, it must be approached carefully. Other systems collect data, but explicitly ask the person who is the source of that data for permission to do so, indicating exactly what that data will be used for, and who will have access to it. In those cases, the actual data ownership can be retained by the person who the data is “about”.
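To give a flavour of the first approach, the sketch below replaces a trainee's identity with a keyed hash before a record is stored. Strictly speaking this is pseudonymization rather than full anonymization: as noted above, the remaining fields might still identify someone in combination, so it is a starting point at best. The field names, the `DIRECT_IDENTIFIERS` set, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac
from typing import Any, Dict

# Hypothetical fields that directly identify a person and must not be stored.
DIRECT_IDENTIFIERS = {"trainee_id", "name", "email"}


def pseudonymize(trainee_id: str, secret_key: bytes) -> str:
    """Derive a stable key for a trainee that cannot be reversed without the secret."""
    return hmac.new(secret_key, trainee_id.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_identity(record: Dict[str, Any], secret_key: bytes) -> Dict[str, Any]:
    """Return a copy of the record with direct identifiers replaced by a keyed hash."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["trainee_key"] = pseudonymize(record["trainee_id"], secret_key)
    return cleaned
```

Because the same trainee always maps to the same key, their records can still be linked over months or years for analysis, while the stored data no longer carries a name or e-mail address. The secret key would need to be guarded at least as carefully as the original identities.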
As I say, this is an important topic and cannot be ignored. But I will keep the focus of this article on maritime training, and therefore leave the discussion of privacy and data ownership for another time.
Big data is an emerging field made possible by the rapidly increasing ability of computers to collect and analyze huge volumes of data, and by the continually increasing integration of computers in all aspects of our lives. Even though it will be left primarily to education researchers to determine how to best collect, analyse and draw conclusions from big data as it applies to maritime training, this is an area that is sure to affect the work of all maritime trainers increasingly in the near future. My belief is that big data could easily be the single largest driver of training improvement since learning began - bigger even than the advent of educational technologies.
Thanks for reading. Have a great day.
# # #
About The Author:
Murray Goldberg is the founder and President of Marine Learning Systems (www.marinels.com), the creator of MarineLMS - the learning management system designed specifically for maritime industry training. Murray began research in eLearning in 1995 as a faculty member of Computer Science at the University of British Columbia. He went on to create WebCT, the world’s first commercially successful LMS for higher education, serving 14 million students in 80 countries. Murray has won over a dozen University, National and International awards for teaching excellence and his pioneering contributions to the field of educational technology. Now, in Marine Learning Systems, Murray is hoping to play a part in advancing the art and science of learning in the maritime industry.