Powered by DHL Trend Research

ARTIFICIAL INTELLIGENCE IN LOGISTICS
A collaborative report by DHL and IBM on implications and use cases for the logistics industry, 2018
Publisher: DHL Customer Solutions

1 UNDERSTANDING ARTIFICIAL INTELLIGENCE

Artificial intelligence (AI) describes computing systems that approximate, mimic, replicate, automate, and eventually improve on human thinking. Throughout the past half-century, a few key components of AI were established as essential: the ability to perceive, understand, learn, problem-solve, and reason. Countless working definitions of AI have been proposed over the years, but the unifying thread in all of them is that computers with the right software can be used to solve the kinds of problems that humans solve, interact with humans and the world as humans do, and create ideas as humans do. In other words, while the mechanisms that give rise to AI are artificial, the intelligence that AI is intended to approximate is indistinguishable from human intelligence.

In the early days of the science, processing inputs from the outside world required extensive programming, which limited early AI systems to a very narrow set of inputs and conditions. Since then, however, computer science has worked steadily to advance the capabilities of AI-enabled computing systems.

Board games have long been a proving ground for AI research, as they typically involve a finite number of players, rules, objectives, and possible moves. This essentially means that games have been taken over by AI one by one, including checkers, backgammon, and even Jeopardy!, to name a few. Most famously, in 1997 IBM's Deep Blue defeated Garry Kasparov, the then reigning world champion of chess. This trajectory persists with the ancient Chinese game of Go, and the defeat of reigning world champion Lee Sedol by DeepMind's AlphaGo in March 2016.

Figure 1: An AI timeline; Source: Lavenda, D. / Marsden, P.
(Timeline content, 1950s to 2010s: AI is born with the Turing Test, the Dartmouth College conference, and information theory and digital signals; focus on specific intelligence with symbolic reasoning and expert systems; focus on specific problems. Additional image sources: Nvidia, Getty Images)

Sedol's defeat was a watershed moment for the prowess of AI technology. Previous successes had depended on what could be called a brute-force approach; systems learned the well-structured rules of the game, mastered all possible moves, and then programmatically decided the best move at machine speed, which is considerably faster than human decision making. On a traditional Go board of 19 by 19 lines, there are more possible combinations than the number of atoms on planet Earth, meaning it is impossible for any computing system available today to master each move. DeepMind's AlphaGo effectively had to develop a sense of reasoning, strategy, and intuition to defeat Sedol; something that Go players have tirelessly tried to perfect for over 2,500 years, yet DeepMind trained AlphaGo to do in a matter of months. The important outcome of Sedol's defeat is not that DeepMind's AI can learn to conquer Go, but that by extension it can learn to conquer anything easier than Go, which amounts to a vast number of things.¹

Current understanding of AI can quickly become convoluted by the dizzying array of complex technical terms and buzzwords common to mainstream media and publications on the topic today.
Two terms in particular are important in understanding AI: machine learning, which is a subset of AI, and deep learning, which is a subset of machine learning, as depicted in figure 3.

Whereas AI is a system or device intended to act with intelligence, machine learning is a more specific term that refers to systems designed to take in information; the intention of the system is to learn from the real world and adjust the learning model as it takes in new information and forms new insights.

Figure 4: A diagram of a neural network with six inputs, seven tuning parameters, and a single output; Source: Nielsen, M. (Input layer, hidden layers, output layer; example problem types include image recognition from pictures, loan approval from applications, and online ad placement from social media profiles and browsing history.)

In simplified form, figure 5 depicts how deep learning algorithms can distinguish the content of an image, as well as where the elements of the image are in relation to one another, by analyzing pixel data alone. The human visual cortex is constantly doing this without our conscious awareness; in computers, however, this perceptive ability is truly novel. This is the type of system that is more useful in addressing real-world data challenges, which is why deep learning systems are the ones that have been directed at the extremely large and fast-moving datasets typically found on social media platforms and in autonomous vehicles.

Deep learning is typically done with neural networks. Neural networks are humanity's best attempt to mimic both the structure and function of the human brain. As new data is fed into a neural network, connections between nodes are established, strengthened, or diminished, in a similar fashion to how connections between neurons in the human brain grow stronger through recurring experiences.
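As a rough illustration of how such node-to-node connections work, here is a minimal sketch in Python of a single forward pass through a tiny fully connected network. All weights and inputs are invented for illustration; each numeric weight stands in for the strength of one connection between nodes:

```python
import math

def forward(inputs, hidden_weights, output_weights):
    """One forward pass through a tiny fully connected network.

    Each weight plays the role of a 'connection' between nodes; tuning a
    weight up or down gives the corresponding input attribute greater or
    lesser influence on the final output.
    """
    # Hidden layer: weighted sum of inputs squashed through a sigmoid.
    hidden = [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
        for row in hidden_weights
    ]
    # Output layer: weighted sum of the hidden activations.
    return sum(w * h for w, h in zip(output_weights, hidden))

# Illustrative numbers only: 3 inputs, 2 hidden nodes, 1 output.
x = [0.5, -1.0, 2.0]
W_hidden = [[0.1, 0.4, -0.2],
            [-0.3, 0.2, 0.5]]
w_out = [0.7, -0.6]
print(forward(x, W_hidden, w_out))
```

Training such a network amounts to repeatedly adjusting weights like `W_hidden` and `w_out` so that the output moves closer to the desired answer, which is the tuning of connections described above.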
Furthermore, each connection in a neural network can be tuned, assigning greater or lesser importance to an attribute, to improve the quality of the output.

Figure 5: Deep learning goes beyond classifying an image to identify the content of images in relation to one another; Source: Stanford (Single objects: classification, classification + localization. Multiple objects: object detection, instance segmentation.)

1.2 How Machines Learn: Three Components of AI

Despite the oversimplification that tends to define AI in the popular press, AI is not one single, unified technology. It is actually a set of interrelated technology components that can be used in a wide variety of combinations depending on the problem being addressed. Generally, AI technology consists of sensing components, processing components, and learning components (see figure 6).

Sensing: The Fuel of AI

To be able to understand or "sense" the real world, AI must take in information. As real-world information comes in many forms, AI must be able to digest text, capture images and video, take in sound, and eventually gather information about environmental conditions such as temperature, wind, and humidity: everything that is typically understood by humans through our sense of touch.

One of the most mature AI sensing capabilities is text-based processing. While AI systems have been processing structured data from databases, spreadsheets, and the internet for many years, recent advances in deep learning have improved AI's ability to process and understand unstructured data. Comments online, in social media, and even within apps are unstructured, so this critical capability dramatically increases the amount and diversity of inputs that AI can leverage to understand the world.

Putting it all together

Figure 6: A full AI learning cycle; Source: IBM / DHL
1. Training data
2. Data gathered continuously from the environment, sensors, and online behavior
3. Data is aggregated and harmonized
4. Machine learning framework processes data
5. Patterns and trends are revealed, generating insight
6. System takes different actions to drive value; the new action is used as input to improve the self-learning of the system

Much of our spoken interaction can now be captured by microphones and made sense of by AI systems. AI systems can consider the context in which spoken words were captured and, with access to large enough datasets containing similar and related phrases, can transform this once unusable data into valuable insight.

Advances in speech-to-text technology are significantly enabling voice-driven AI. Today the comprehension ability of AI-driven voice assistants is surpassing that of humans. The key metric for measuring speech-to-text performance is word error rate: effectively, how accurately a person or system translates and interprets the words contained in a given voice sample. In a typical interaction between two people, the average percentage of words misunderstood by each person is 6%. Today the best AI-driven voice assistants are able to achieve a word error rate of 5%. And as AI-driven voice assistants improve their models and comprehension with each new query (in other words, as they are given new data to learn from), their word error rate continues to fall.²

Figure 7: Smart speakers with AI-driven voice assistants; Source: Heavy

Images are another rich source of insight from unstructured data. It was estimated that even four years ago 1.8 billion images were uploaded to the internet daily, and this number continues to grow.³ Fortunately, many AI capabilities have been developed to process information from images.

² Chen, F. (2017).
³ Meeker, M. (2014).
⁴ Chou, T. (2016).
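The word error rate discussed above can be made concrete: it is the word-level edit distance (substitutions, insertions, and deletions) between what was actually said and what was transcribed, divided by the number of words spoken. The short function below is an illustrative sketch of that calculation, not code from the report:

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance (substitutions,
    insertions, deletions) divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance, computed over words.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return dist[len(ref)][len(hyp)] / len(ref)

# One word wrong out of five spoken words is a 20% word error rate.
print(word_error_rate("please play my favorite song",
                      "please play my favorite son"))  # 0.2
```

By this measure, the 6% human benchmark and the 5% achieved by the best voice assistants correspond to roughly one misrecognized word in every seventeen to twenty.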
Companies like Google have leveraged this type of AI for years in consumer settings, and an increasing number of companies are deploying static and video-capable systems in their daily operations. As AI continues to get better at turning the vast sea of visual information into system-usable content, the accuracy with which these systems understand our world is also increasing.

The Internet of Things (IoT) is already making machine data available for consumption by AI-based systems, often for the first time. IoT involves collecting large datasets from vast fleets of heterogeneous devices, but making sense of this information and learning from it can be a challenging task even for the advanced data analytics tools of today.

Figure 8: The evolution of picture understanding with deep learning; Source: IBM (1980s: scanned digits; 2012: natural photos, where the computer understands the picture and sees that 'a woman is throwing a frisbee in a park'.)

Figure 9: AI in the Internet of Things; Source: DHL
1. Things: billions of connected assets equipped with sensors, carrying out various tasks
2. Connectivity: how the devices are connected
3. Data: how data is gathered from the connected devices
4. Insight: what an AI model uncovers and learns from patterns within large volumes of complex data
5. Action: new action taken to drive value by orchestrating assets differently; the new action is used as input to improve the self-learning of the system
Figure 10: An overview of machine learning techniques; Source: Jha, V.

Taxonomy of machine learning methodologies:
- Supervised learning
  - Classification: identity fraud detection, image classification, customer retention, diagnostics
  - Regression: advertising popularity prediction, weather forecasting, market forecasting, estimating life expectancy, population growth prediction
- Unsupervised learning
  - Clustering: recommender systems, targeted marketing, customer segmentation
  - Dimensionality reduction: meaningful compression, structure discovery, big data visualization, feature elicitation
- Reinforcement learning: game AI, real-time decisions, robot navigation, learning tasks, skill acquisition

Processing & Learning Components: Frameworks & Training Techniques

Once an AI system has collected data from sensing, it processes this information by applying a learning framework to generate insight from the data. In addition to the similarities that exist between human intelligence and AI, strong parallels have also been observed between how humans and AI systems learn. Youngsters tend to learn from their parents and teachers in highly structured settings with lots of reinforcement, whereas older, more experienced adult learners are well adapted to seeking their own inputs and learning from the world around them. Similarly, AI systems use supervised, unsupervised, and reinforcement learning to take in and process information about the world.

Supervised learning, as the name suggests, is learning that takes place when an AI-enabled system is directly informed by humans. A doctor who evaluates x-ray images to detect cancer risk, for example, can annotate the images with his or her expert input, then feed them into an AI system to facilitate supervised learning.
Another example of supervised learning is when the AI system sorts through x-ray images for a doctor to review and approve, in an effort to help improve the learning of the AI system.

Unsupervised learning takes place when an AI-enabled system is able to identify clusters or dimensions in the data itself, without additional guidance from human data or computer scientists. This technique can lead to significantly novel and unexpected results, depending on the data the system is exposed to. As an example, in 2012 unsupervised learning is how a Google neural network was able to recognize cat faces in uploaded YouTube videos, simply by "watching" 10 million of them with little guidance about the desired output.⁵

Reinforcement learning takes place when the AI system is tasked not only with processing the available inputs but also with learning the rules of the game. This learning is based not on direct human interaction but on the amassed responses the environment gives to the AI system. An analogous example of reinforcement learning is how infants first learn to walk: they observe others until they feel able to try it themselves, then attempt to walk on their own, failing many times but improving with each attempt until they can walk unaided. AlphaGo used reinforcement learning techniques in this way, ingesting a large number of completed and in-progress Go games in order to simultaneously figure out how the game works and how to play and beat any competitor.

Many different types of machine learning framework exist today, each offering its own core deep learning capability based on neural networks. Data scientists and software developers need access to this core functionality in order to develop AI solutions, so they must select the framework that works best with their defined AI objectives and delivers the required deep learning capability.

⁵ Oremus, W. (2012).
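The contrast between the first two learning styles can be sketched in a few lines of Python. All sensor readings, labels, and function names below are invented for illustration: the supervised part learns from human-provided labels, while the unsupervised part discovers the same two groups with no labels at all.

```python
def nearest_centroid_fit(points, labels):
    """Supervised learning: human-provided labels tell the system what
    each class looks like; each class is summarized by its mean value."""
    centroids = {}
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        centroids[label] = sum(members) / len(members)
    return centroids

def nearest_centroid_predict(centroids, point):
    """Classify a new point by the closest class centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - point))

def kmeans_1d(points, iterations=10):
    """Unsupervised learning: with no labels at all, discover two
    clusters purely from the structure of the data (1-D k-means, k=2)."""
    a, b = min(points), max(points)  # crude initial cluster centers
    for _ in range(iterations):
        left = [p for p in points if abs(p - a) <= abs(p - b)]
        right = [p for p in points if abs(p - a) > abs(p - b)]
        if left:
            a = sum(left) / len(left)
        if right:
            b = sum(right) / len(right)
    return sorted([a, b])

# Invented sensor readings: three "normal" and three "faulty" machines.
readings = [1.0, 1.2, 0.8, 9.7, 10.1, 10.4]
labels = ["normal", "normal", "normal", "faulty", "faulty", "faulty"]

centroids = nearest_centroid_fit(readings, labels)
print(nearest_centroid_predict(centroids, 9.0))  # "faulty"
print(kmeans_1d(readings))  # two centers, roughly 1.0 and 10.07
```

Reinforcement learning would differ from both: instead of labels or raw structure, the system would take actions and learn from the rewards the environment returns, as in the infant-walking and AlphaGo examples above.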