Study notes - AI for everyone
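This is not a video course about technical details, since it is "for everyone", but it still cleared up a few points I had never quite understood:
- the difference between ANI and AGI
- the respective responsibilities and outputs of a data science project and an ML project
- the steps of an ML/AI pipeline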
AI for everyone
I'm looking at AI again. This time it's another Andrew Ng course, but rather than pure theory it teaches how to run AI projects. I hope it gives me some ideas.
ANI and AGI
Artificial narrow intelligence (ANI): e.g. self-driving cars, smart speakers, AI in farming. Each is one specific task that a machine can do well.
Artificial general intelligence (AGI): can do anything a human can do.
ANI is progressing well, while AGI has made no obvious progress.
ML and supervised learning
Going in circles: I studied a bit of ML in both 2016 and 2017, and now I'm at it again.
Neural network and deep learning
A neural network is better called an artificial neural network, because it has almost nothing to do with how neurons in the human brain work.
The terms neural network and deep learning are used almost interchangeably. They mean essentially the same thing.
What makes a good AI company
Shopping mall + website != Internet company. That's a good way to put it: building a website to sell things doesn't make you an Internet company. Haha.
So what makes a good Internet company?
- A/B testing (I can't believe this comes first, though of course it's just his opinion)
- Short iteration time
- Decision making pushed down to engineers and other specialised roles.
So what makes a good AI company?
- Strategic data acquisition
- Unified data warehouse
- Pervasive automation
- New roles, e.g. machine learning engineer, and division of labor
A company needs every one of these, and needs to do each well, to count as a good one.
AI transformation
- execute pilot projects to gain momentum
- build an in-house AI team
- provide broad AI training, to managers and engineers
- develop an AI strategy
- develop internal and external communications
Get the first pilot project done first. So, please, you still have to build something before anything else.
What ML can do or cannot do
Can do, when:
- Learning a simple concept: something a human can do in less than 1 second, e.g. tell whether something is a car
- lots of data is available
Cannot do: learn a complex concept from a small amount of data, e.g. analyse the market and write a 30-page report, or work out a person's intention when they raise their left hand.
Building an AI project
Question:
- How does this differ from running an ordinary software project?
Workflow of a machine learning project
Key steps of a machine learning project
Alexa:
- collect data
- train model - iterate many times
- deploy model - get more data back, maintain/update model
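To make the three steps above concrete for myself, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy data, the function names `collect_data`, `train` and `predict`, and the "model" itself, which is just a threshold between two class averages, nothing like what a real smart speaker uses.

```python
# Toy A -> B mapping: input A is one number, label B is 0 or 1.
# All data and names here are made up for illustration.

def collect_data():
    """Step 1: collect labeled examples as (input, label) pairs."""
    return [(1.0, 0), (1.5, 0), (2.0, 0), (6.0, 1), (6.5, 1), (7.0, 1)]

def train(examples):
    """Step 2: learn a threshold halfway between the two class means."""
    zeros = [x for x, y in examples if y == 0]
    ones = [x for x, y in examples if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def predict(threshold, x):
    """Step 3: the deployed model maps a new input A to an output B."""
    return 1 if x > threshold else 0

data = collect_data()
model = train(data)

# After deployment, new labeled examples come back; retrain to update the model.
data += [(3.0, 0), (5.0, 1)]
model = train(data)
```

The point of the sketch is the loop: the deployed model produces more data, and that data goes back into training.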
Self-driving car
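- collect data: images and the positions of other cars in the images (this is what I used to miss: the labels giving the correct result; a large pile of images alone is useless)
- train model: train the model until it can accurately work out where the cars are
- deploy model: release your car, then get more images back
Question: can the processor in the car, carrying the model, produce accurate results on its own? If the processor isn't powerful enough, does the car need an Internet connection to a more powerful server that processes the data and returns the result? Would that be too slow? Would it need at least 5G?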
Workflow of a data science project
The output of a data science project is often a set of actionable insights, which lead you to do things differently.
Optimizing a sales funnel
- collect data - people from different countries visit different pages
- analyze the data - why do overseas visitors not check out? Is it because of a roughly estimated, high shipping fee?
- suggest hypotheses and actions - show a more accurate shipping fee
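The "analyze the data" step above could start with something as simple as a checkout rate per country. This is only a sketch: the visit log, its fields and the function name `checkout_rate_by_country` are hypothetical, and a real project would read the rows from the site's own analytics data.

```python
from collections import defaultdict

# Hypothetical visit log: (country, reached_checkout_page, completed_order)
visits = [
    ("US", True, True), ("US", True, True), ("US", True, False),
    ("DE", True, False), ("DE", True, False), ("DE", True, True),
    ("DE", True, False),
]

def checkout_rate_by_country(rows):
    """Return completed orders / checkout visits for each country."""
    reached = defaultdict(int)
    completed = defaultdict(int)
    for country, at_checkout, ordered in rows:
        if at_checkout:
            reached[country] += 1
            if ordered:
                completed[country] += 1
    return {c: completed[c] / reached[c] for c in reached}

rates = checkout_rate_by_country(visits)
```

A much lower rate for one country does not prove the shipping-fee hypothesis; it only shows where to look. The output is an insight for people to act on, not a deployed model.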
This is a concept I used to confuse: data science and machine learning projects have different purposes and different outputs.
Every job function needs to learn how to use data
DS: Data helps farmers decide which crops to plant to maintain better soil condition.
ML: Take a picture of a weed in the field and spray weed killer just on the weed. Here ML technology helps farmers maintain better yields.
How to choose an AI project?
Brainstorm framework:
- think about automating tasks rather than automating jobs
- what are the main drivers of business value
- what are the main pain points in your business
- can start with a small set of data
Due diligence:
- technical diligence: can AI meet the requirement, how much data is needed, engineering timeline
- business diligence: valuable for your business, lower cost, increase revenue, launch new product
also: ethical diligence
Build or Buy
ML projects
can be in-house or outsourced, since they require less domain knowledge and more ML knowledge.
DS projects
are more commonly in-house, since they require more business domain knowledge; the team should have a deep understanding of the company's own business data.
Some things will become industry standard; avoid building those. Build something specific to your own business.
Working with AI team
The AI team expects training data and test data.
The expectation of output should not be 100% accuracy.
- limitation of ML
- insufficient data
- mislabeled data
- ambiguous label
Note: here, data always means labeled data.
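A small sketch of why the measured accuracy on test data falls short of 100%, covering two of the reasons listed above. The `model` here is a hypothetical keyword rule standing in for a trained model, and the four test examples are invented, including one deliberately mislabeled row.

```python
def accuracy(model, examples):
    """Fraction of labeled examples where the model agrees with the label."""
    correct = sum(1 for text, label in examples if model(text) == label)
    return correct / len(examples)

# Hypothetical stand-in for a trained model: flags any text containing "free".
def model(text):
    return "spam" if "free" in text.lower() else "ok"

# Hypothetical labeled test set, kept separate from the training data.
test_data = [
    ("Free prize inside", "spam"),
    ("Meeting at 10", "ok"),
    ("Feel free to call me", "ok"),       # limitation of the model: it says spam
    ("Free gift card, click now", "ok"),  # mislabeled: the model is right, the label is wrong
]

score = accuracy(model, test_data)
```

The third row counts against the model because of its own limitation; the fourth counts against it only because a human gave the wrong label.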
This course really is for people who are not ML engineers; it feels like a course for managers or business owners.
Tech tool for AI
Machine learning tools: TensorFlow, PyTorch, Keras, MXNet, CNTK, Caffe, PaddlePaddle, Scikit-learn, R?, Weka
Research publications: arXiv
GPU: GPUs play a big role in deep learning computation.
Cloud vs. On-Premises: on-premises means running on your own computers, or your own company's servers, rather than on rented cloud servers.
Edge deployment
Put the processor where the data is collected and compute the result there. For example, a self-driving car needs to collect data and act on the result immediately; no network is fast enough for it to send the data to the cloud, wait for processing, and get the result back in time. So the processor and the data are both in the car.
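A bit of arithmetic shows why latency matters here: the car keeps moving while it waits for a result. The speed and the two latency figures below are my own assumed numbers for illustration, not measurements from the course.

```python
def metres_travelled(speed_kmh, latency_ms):
    """Distance the car covers while waiting latency_ms for a result."""
    speed_mps = speed_kmh * 1000 / 3600   # km/h -> m/s
    return speed_mps * latency_ms / 1000  # ms -> s

# Assumed figures, for illustration only.
cloud = metres_travelled(100, 200)  # 200 ms round trip to a cloud server
edge = metres_travelled(100, 20)    # 20 ms for on-board processing
```

Under these assumptions the car travels roughly 5.6 m before a cloud result arrives, versus roughly 0.6 m with on-board processing.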
Build AI in your company
Case study of complex AI products
Smart speaker:
steps to process the command
- Trigger word/wake word detection - audio (A) -> whether it is the trigger word (B)
- Speech recognition - audio -> text
- Intent recognition - analyze the text -> what the intention is
- Execute a joke - randomly pick a joke and play it out
The whole sequence of steps is called an AI pipeline. It is not uncommon in a company for each step (component) to be owned by its own team.
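The four steps above can be sketched as four functions chained together. Every function here is a hypothetical stub: the "audio" is just a string, the wake word "hey device" is made up, and in a real product steps 1 to 3 would each be a trained model.

```python
import random

WAKE_WORD = "hey device"  # made-up wake word for this sketch
JOKES = ["Why did the model cross the road? To minimise the loss."]

def detect_wake_word(audio):
    """Step 1: audio -> True/False. Stub; a real system uses a trained model."""
    return audio.startswith(WAKE_WORD)

def speech_to_text(audio):
    """Step 2: audio -> text. Stub; the 'audio' is already a string here."""
    return audio[len(WAKE_WORD):].strip(" ,")

def recognise_intent(text):
    """Step 3: text -> intent label."""
    return "joke" if "joke" in text else "unknown"

def execute(intent):
    """Step 4: run the command for the recognised intent."""
    if intent == "joke":
        return random.choice(JOKES)
    return "Sorry, I can't do that yet."

def pipeline(audio):
    """Chain the four components; do nothing if the wake word is missing."""
    if not detect_wake_word(audio):
        return None
    return execute(recognise_intent(speech_to_text(audio)))
```

Because each step only depends on the output of the previous one, separate teams can own and improve each component independently.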
Self-driving Car:
- image/radar/lidar, GPS / map
- car detection / pedestrian detection -> supervised learning; add: trajectory prediction
- motion planning -> output is the path and speed
- Steer/accelerate/brake
Roles in an AI team
Software engineer
- work on steps like step 4 above (execute a joke, ensure self-driving reliability)
Machine learning engineer
- generate the A->B mapping, gather data, train the model
Machine learning researcher
- extend state-of-the-art in ML
Applied ML Scientist
- between the two roles above
Data scientist
- examine data and provide insights, make presentation to team/executive, drive business decision making
Data engineer
- organise data, make sure data is stored in an easily accessible, secure and cost-effective way
AI Product manager
- help decide what to build, what’s feasible and valuable
Get started with a small team, e.g. 1 SE and 1 MLE/DS.
AI Transformation playbook
- execute pilot projects
  - more important for the initial project to succeed than to be the most valuable
  - show traction within 6-12 months
  - can be in-house or outsourced
- build an in-house AI team
  - let AI experts work in the business units
- provide broad AI training
  - executives and senior leaders - what AI can do for the business; AI strategy; resource allocation
  - leaders of divisions working on AI projects - set project direction; monitor progress; resource allocation
  - AI engineers - build and ship AI software; gather data…
- develop an AI strategy
  - leverage AI to create an advantage specific to your industry sector
  - design a strategy aligned with the “virtuous cycle of AI”
  - strategic data acquisition/unified data warehouse
- develop internal and external communications
  - investor relations
  - government relations
  - consumer education
  - talent/recruitment
  - internal communications
AI pitfalls to avoid
Don’t
- expect AI to solve everything
- hire 2-3 ML engineers and count on them for everything
- expect an AI project to work the first time
- expect traditional planning processes to work for the AI team
Do
- be realistic about what AI can and cannot do given its limitations
- plan for an iterative process
- let the AI team establish its own process
First AI Product
Initial steps:
- get friends to study this course
- reading group
- brainstorming projects
- hire ML/DS to help
- discuss AI transformation with the CEO or board
Major AI application areas
Computer vision:
image classification, object recognition
- face recognition: learns from earlier images and, given a new image, tells whether it shows the same person
object detection: tell apart different objects in one picture, and where they are
image segmentation: one step further, it tells for every pixel in the image what that pixel belongs to, e.g. a car or a pedestrian
- reading an X-ray scan
tracking: track where different people are moving
Natural Language processing
text classification: tell whether an email is spam or not
- sentiment recognition: “the food was good” -> means 4 stars; Grammarly recently released a tone emoji plugin that reads the tone from the text you type (one question: does this really need ML?)
information retrieval
- web search
named entity recognition: recognise people's names, place names, company names
machine translation
parsing, part-of-speech tagging: tag the nouns, determiners and prepositions in a sentence; it is normally not an end-user product, but a common AI step that helps other AI algorithms