Study notes - AI for everyone

This is not a video about technical details (it is "for everyone", after all), but it still cleared up several points I had never quite understood:

  1. The difference between ANI and AGI
  2. The respective responsibilities and outputs of a data science project and an ML project
  3. The steps of an ML/AI pipeline

AI for everyone

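I am studying AI again, and once more it is a course by Andrew Ng. This time it is not pure theory but a course on how to run AI projects. I hope it gives me some ideas.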

ANI and AGI

Artificial narrow intelligence (ANI): AI that does one specific task, e.g. a self-driving car, a smart speaker, AI in farming. These are specific tasks that a machine can do better.

Artificial general intelligence (AGI): AI that can do anything a human can do.

ANI is progressing well, while AGI has made no obvious progress.

ML and supervised learning

Going around in circles: I studied a bit of ML in both 2016 and 2017, and now I am at it again.

Neural network and deep learning

A neural network is better called an artificial neural network, because it has nothing to do with the biological neural networks in the human brain.

In practice, “neural network” and “deep learning” are used interchangeably. They mean essentially the same thing.

What makes a good AI company

Shopping mall + website != Internet company. This is a nice way to put it: building a website to sell things does not make you an Internet company. Haha.

So what makes a good Internet company?

  • A/B testing (I can hardly believe this is the first item, but of course it is only his opinion)
  • Short iteration time
  • Decision making pushed down to engineers and other specialised roles.

So what makes a good AI company?

  • Strategic data acquisition
  • Unified data warehouse
  • Pervasive automation
  • New roles, e.g. machine learning engineer, and division of labor

A company needs every one of these, and needs to do each of them well, to count as a good one.

AI transformation

  • execute pilot projects to gain momentum
  • build an in-house AI team
  • provide broad AI training, to managers and engineers
  • develop an AI strategy
  • develop internal and external communications

Get the first pilot project done first. So you still have to actually build something first, please.

What ML can do or cannot do

Can do, when

  • Learning a simple concept: something a human can do in less than 1 second, e.g. tell whether something is a car
  • Lots of data is available

Cannot do: learn a complex concept from a small amount of data, e.g. analyse the market and write a 30-page report, or work out a person’s intention when he/she raises his/her left hand.

Building an AI project

Question:

  • How does it differ from running an ordinary software project?

Workflow of a machine learning project

Key steps of a machine learning project

Alexa:

  1. collect data
  2. train model - iterate many times
  3. deploy model - get more data back, maintain/update model

Self-driving car

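  1. collect data: images and the positions of other cars in the images (this is what I used to miss: the labels, i.e. the correct answers; a large pile of images alone is useless)
  2. train model: train the model until it can accurately work out where the cars are
  3. deploy model: release your car, then get more images back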
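
The point about labels can be made concrete with a toy sketch. Everything in it is an assumption for illustration and not from the course: the made-up (width, height) features, the six labeled examples, and the nearest-centroid method, which stands in for a real image model. It only shows that training needs pairs of input A and correct answer B.

```python
# Toy supervised learning: learn an A -> B mapping from labeled examples.
# Each input A is a made-up (width, height) pair in meters (assumption);
# each label B is "car" or "pedestrian".
from collections import defaultdict
from math import dist

labeled_data = [
    ((1.8, 1.5), "car"),
    ((2.0, 1.4), "car"),
    ((1.9, 1.6), "car"),
    ((0.5, 1.7), "pedestrian"),
    ((0.6, 1.8), "pedestrian"),
    ((0.4, 1.6), "pedestrian"),
]

def train(examples):
    """Compute one centroid (mean feature vector) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*points))
        for label, points in grouped.items()
    }

def predict(model, features):
    """Return the label whose centroid is closest to the input."""
    return min(model, key=lambda label: dist(model[label], features))

model = train(labeled_data)
print(predict(model, (1.7, 1.5)))   # a wide, low object
print(predict(model, (0.5, 1.75)))  # a narrow, tall object
```

Without the second element of each pair (the label), `train` would have nothing to group by, which is the same reason unlabeled images alone are not enough.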

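Question: can the processor in the car, carrying the model, produce accurate results on its own? If the processor is not powerful enough, does the car need an Internet connection to a more powerful server that processes the data and returns the result? Would that be too slow? Would it need at least 5G?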

Workflow of a data science project

The output of a data science project is often a set of actionable insights, which lead you to do things differently.

Optimizing a sales funnel

  1. collect data - visitors from different countries going to different pages
  2. analyze the data - why do overseas visitors not check out? Is it because of a roughly estimated, high shipping fee?
  3. suggest hypotheses and actions - show a more accurate shipping fee
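
As a rough illustration of the "analyze the data" step, the sketch below computes the share of visitors per country who reached checkout. The visit log, the country codes and the page names are all hypothetical; a real project would read them from the site's own analytics data.

```python
from collections import Counter

# Hypothetical visit log: (visitor country, last funnel page reached).
visits = [
    ("US", "checkout"), ("US", "cart"), ("US", "checkout"), ("US", "checkout"),
    ("DE", "cart"), ("DE", "cart"), ("DE", "checkout"), ("DE", "product"),
]

def checkout_rate_by_country(rows):
    """Return {country: fraction of visitors whose last page was checkout}."""
    total = Counter(country for country, _ in rows)
    completed = Counter(country for country, page in rows if page == "checkout")
    return {country: completed[country] / total[country] for country in total}

rates = checkout_rate_by_country(visits)
for country, rate in sorted(rates.items()):
    print(f"{country}: {rate:.0%} of visitors reached checkout")
```

The output here is a number per country that a person then interprets (for example, as a hint to look at shipping fees), not a deployed model. That is the difference in output between a data science project and an ML project.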

This is a distinction I used to blur: data science projects and machine learning projects have different purposes and different outputs.

Every job function needs to learn how to use data

DS: Data helps farmers decide which crops to plant to maintain better soil condition.

ML: Take a picture of a weed in the field and spray weed killer just on the weed. This is ML technology helping farmers maintain better yields.

How to choose an AI project?

Brainstorm framework:

  • think about automating tasks rather than automating jobs
  • what are the main drivers of business value
  • what are the main pain points in your business
  • can start with a small set of data

Due diligence:

  1. technical diligence: can AI meet the requirement, how much data is needed, engineering timeline
  2. business diligence: valuable for your business, lower cost, increase revenue, launch new product

also: ethical diligence

Build or Buy

ML projects can be in-house or outsourced, since they depend less on domain knowledge and more on ML knowledge. DS projects are more commonly in-house, since they require more business domain knowledge: the team should have a deep understanding of the company’s own business data.

Some things will become industry standard; avoid building those. Build something specific to your own business.

Working with an AI team

The AI team expects training data and test data.

Do not expect 100% accuracy from the output. Reasons:

  • limitation of ML
  • insufficient data
  • mislabeled data
  • ambiguous label

Note: data always means labeled data.
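
A small sketch of why measured accuracy falls short of 100% even for a good model. The ten-item test set, the model output and the single mislabeled item are all invented for illustration; the only point is how accuracy on a test set is computed and how a labeling mistake distorts it.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical test set of 10 items: correct answers and the model's output.
true_labels = ["car"] * 5 + ["no car"] * 5
model_output = ["car"] * 5 + ["no car"] * 4 + ["car"]  # one genuine mistake

# Assume the annotator mislabeled the first test item.
noisy_labels = list(true_labels)
noisy_labels[0] = "no car"

print(accuracy(model_output, true_labels))   # scored against correct labels
print(accuracy(model_output, noisy_labels))  # scored against the mislabeled set
```

Scored against the mislabeled set, the model is penalised for an answer that was actually right, so the reported accuracy is lower than the real one.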

This course really is aimed at people who are not ML engineers; it feels like a course for managers or business owners.

Tech tool for AI

Machine learning tools: TensorFlow, PyTorch, Keras, MXNet, CNTK, Caffe, PaddlePaddle, Scikit-learn, R?, Weka

Research publications: arXiv

GPUs play a big role in deep learning computation.

Cloud vs. on-premises: on-premises means running on your own or your company’s own machines.

Edge deployment: put the data and the processor together and get the result locally. For example, a self-driving car needs to collect data and process it immediately; no network is fast enough for it to send the data to the cloud, have it processed, and get the result back in time. So the processor and the data are both in the car.
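
A back-of-the-envelope calculation shows why the delay matters for a car. The speed of 108 km/h and the two delay values are assumed figures chosen for illustration, not measurements of any real network or vehicle.

```python
def distance_travelled_m(speed_kmh, delay_ms):
    """Meters covered at a constant speed while waiting delay_ms milliseconds."""
    speed_m_per_s = speed_kmh * 1000 / 3600
    return speed_m_per_s * delay_ms / 1000

# Assumed figures for illustration only: one highway speed, two response delays.
for delay_ms in (10, 200):
    meters = distance_travelled_m(speed_kmh=108, delay_ms=delay_ms)
    print(f"{delay_ms} ms delay at 108 km/h -> {meters:.1f} m travelled")
```

At 108 km/h the car moves 30 m every second, so a 200 ms wait for a remote answer means 6 m travelled before the car can react, while a 10 ms on-board computation costs 0.3 m.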

Build AI in your company

Case study of complex AI products

Smart speaker: steps to process the command

  1. Trigger word/wake word detection - audio (A) -> whether it is the trigger word (B)
  2. Speech recognition - audio -> text
  3. Intent recognition - text -> what the user’s intention is
  4. Execute the command, e.g. tell a joke - randomly pick a joke and play it

The whole sequence of steps is called an AI pipeline. It is not uncommon for each step (component) to be owned by a separate team within one company.
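
The four smart speaker steps can be sketched as a chain of functions. Every stage below is a hypothetical stub: the function names, the wake word "hey device" and the use of plain text in place of audio are assumptions so that the sketch runs without any ML library. In a real product each stub would be a trained model or service.

```python
import random

# Each stage is a stub standing in for a trained model (assumption).
# "Audio" is represented as plain text so the example is runnable as-is.

def detect_wake_word(audio, wake_word="hey device"):
    """Stage 1: audio -> True/False (was the wake word spoken?)."""
    return audio.lower().startswith(wake_word)

def recognize_speech(audio, wake_word="hey device"):
    """Stage 2: audio -> text of the command that follows the wake word."""
    return audio[len(wake_word):].strip(" ,").lower()

def recognize_intent(text):
    """Stage 3: text -> intent label."""
    return "joke" if "joke" in text else "unknown"

def execute(intent, jokes=("Joke A", "Joke B")):
    """Stage 4: intent -> action (here, pick a joke at random)."""
    if intent == "joke":
        return random.choice(jokes)
    return "Sorry, I cannot do that."

def run_pipeline(audio):
    """Run the stages in order; stay silent unless the wake word is heard."""
    if not detect_wake_word(audio):
        return None
    return execute(recognize_intent(recognize_speech(audio)))

print(run_pipeline("Hey device, tell me a joke"))
```

Because each stage only depends on the output of the previous one, a separate team can own and improve one function without touching the others, which matches the one-team-per-component setup described above.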

Self-driving Car:

  1. image/radar/lidar, GPS / map
  2. car detection / pedestrian detection -> supervised learning; also: trajectory prediction
  3. motion planning -> output is the path and speed
  4. Steer/accelerate/brake

Roles in an AI team

Software engineer - works on steps like step 4 above (execute the joke, ensure self-driving reliability)

Machine learning engineer - builds the A->B mapping: gathers data, trains the model

Machine learning researcher - extends the state of the art in ML

Applied ML scientist - between the two above

Data scientist - examines data and provides insights, presents to the team/executives, drives business decision making

Data engineer - organises data, makes sure data is stored in an easily accessible, secure and cost-effective way

AI product manager - helps decide what to build, what’s feasible and valuable

Get started with a small team; it could be 1 SE and 1 MLE/DS.

AI Transformation playbook

  1. execute pilot project
    • more important for the initial project to succeed than to be the most valuable
    • show traction within 6-12 months
    • can be in-house or outsourced
  2. build an in-house ai team
    • let ai expert work in business unit
  3. provide broad ai training
    • executives and senior leaders - what AI can do for business; AI strategy; resource allocation
    • leaders of divisions working on ai projects - set project direction; monitor progress; resource allocation
    • AI engineers - build and ship ai software; gather data…
  4. develop an ai strategy
    • leverage ai to create advantage specific to your industry sector
    • design strategy aligned with the “virtuous cycle of ai”
    • strategic data acquisition/unified data warehouse
  5. develop internal and external communications
    • investor relations
    • gov relations
    • consumer education
    • talent/recruitment
    • internal communications

AI pitfalls to avoid

Don’t

  • expect AI to solve everything
  • hire 2-3 ML engineers and count on them for everything
  • expect an AI project to work the first time
  • expect traditional planning processes to apply to the AI team unchanged

Do

  • be realistic about what AI can and cannot do given its limitations
  • plan for an iterative process
  • let the AI team establish its own process

First AI Product

Initial steps:

  • get friends to study this course
  • reading group
  • brainstorming projects
  • hire ML/DS to help
  • discuss AI transformation with the CEO or board

Major AI application areas

Computer vision:

  • image classification, object recognition
    • face recognition: learn from existing images and, given a new image, tell whether it is the same person
  • object detection: tell apart the different objects in one picture, and where they are
  • image segmentation: one step further, it tells for every pixel in the image what object that pixel belongs to, e.g. a car or a pedestrian
    • reading an X-ray scan
  • tracking: follow where different people are moving

Natural Language processing

  • text classification: tell whether an email is spam or not
    • sentiment recognition: “the food was good” -> means 4 stars; Grammarly recently released a tone emoji plugin that reads the tone from the text you type (one question: does this really need ML?)
  • information retrieval
    • web search
  • named entity recognition: recognise people’s names, place names, company names
  • machine translation
  • parsing, part-of-speech tagging: tag the nouns, determiners, prepositions in a sentence; it is normally not an end-user product, but a common step that supports other AI algorithms
