Some concepts you'll need to know
Every new approach to problems invariably has its own terminology. Machine Learning is no different, but the base set of terminology is thankfully not too large. You do not have to know any of this in depth. For general purposes, it helps to align to a common set of terms.
You only need to provide a relatively small set of training data to teach the system.
Example business problems and the training data each one needs:
- Classify customer input - text containing the input and the actual result
- Classify negative feedback - text containing the input and the actual result
- Classify type of part on the conveyor line - images of various part types and what those parts are
- Classify how well a machine is running - set of sounds for machine running well, and for machine not running well
- Predict if data center is running well - a set of metrics from around the data center like utilization, heat, network traffic
- Predict the price of a house given 100 factors - those 100 factors and the actual house price for some set of houses
Problems we help you with
Identify 'what is this object or picture or piece'. An image model can have many classifications of answers. For example, "is this a happy person, sad, angry, thoughtful, distressed", "does this part look acceptable", "is this fruit ripe or not ripe", "what type of butterfly is this", "what model year is this car". Or "does this text string mean they have issues with web, customer service, shipping, ..."
Identify from an audio recording 'what am I hearing'. There can be many classes of answers. For example, "is this the number 1, the number 2, ...", "is this a jackhammer, is this a truck, is this a bird ...", "does the machine that is making parts sound OK". Anything that makes a sound can be classified.
Identify 'what does this text mean'. A machine learning text model can have many classifications of answers. "Is it spam", "is this a positive review or negative or in between", "is the customer reporting a problem and if so what department should receive this", "does this email mean the customer has issues with web, customer service, or shipping".
Binary classification is used to determine if new data is 'in' or 'out' of the trained model. It is typically used when there is a large amount of good data (nuclear power plant readings, network health checks and metrics, call center metrics, fraud detection, ...) but very little data outside the good band of data. The data points are all numeric and in CSV format, just like a spreadsheet.
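As a rough illustration of the 'in' or 'out' idea (not a description of how Capice builds its models), the sketch below learns a band from known-good numeric rows and checks whether a new row falls inside it. The function names, the three-standard-deviation band width, and the sample readings are all assumptions made up for this example.

```python
from statistics import mean, stdev

def fit_band(rows, k=3.0):
    """Learn a (low, high) band per feature from rows of known-good readings."""
    columns = list(zip(*rows))  # one tuple of values per feature
    return [(mean(c) - k * stdev(c), mean(c) + k * stdev(c)) for c in columns]

def is_in(band, row):
    """Return True when every value in the row sits inside its feature's band."""
    return all(low <= value <= high for (low, high), value in zip(band, row))

# Hypothetical good readings: temperature and utilization, one row per sample.
good_readings = [
    [70.1, 0.52],
    [69.8, 0.49],
    [70.4, 0.50],
    [70.0, 0.51],
    [69.9, 0.48],
]

band = fit_band(good_readings)
print(is_in(band, [70.2, 0.50]))  # a reading close to the good data
print(is_in(band, [95.0, 0.50]))  # a reading far outside the good data
```

Only good data is needed to build the band, which matches the situation described above where readings outside the good band are rare.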
Provide up to a boatload of related and unrelated numeric data (aka features), and the machine learning model will predict the numeric result.
For example, for a house price predictor, you might provide features like rooms, square footage, garage size, age of house, ... (the sky is the limit), and the model will then predict the price of any house.
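To make the house price example concrete, here is a minimal sketch of fitting a linear model with ordinary least squares in plain Python. It is an illustration only: the three features, the made-up prices, and the helper names fit_linear and predict are assumptions, and nothing here claims to be the algorithm Capice runs.

```python
def fit_linear(features, prices):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept term
    n = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, prices))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

def predict(weights, feature_row):
    """Intercept plus the weighted sum of the features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))

# Hypothetical training data: (square footage, rooms, age in years) -> price.
houses = [(1000, 3, 10), (1500, 4, 5), (2000, 4, 20),
          (1200, 2, 30), (1800, 5, 1), (900, 2, 40)]
prices = [175000, 237500, 280000, 175000, 279500, 140000]

weights = fit_linear(houses, prices)
predicted = predict(weights, (1600, 3, 15))
print(round(predicted))
```

The same shape carries over to more features: each extra factor adds one column to the training rows and one weight to the model.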
In other examples, the classification might be a boolean like true or false. For example, you might want to predict whether a person will "like or not like a movie" given their demographics (again, the sky is the limit) and preference information. In this case, we could also tell you the probability that a user would "like or not like a movie".
If you need one model to feed another, use 'chains'. For example, say you have a variety of parts coming down a conveyor and you need to determine good/bad for each part type. As opposed to one mammoth model, you can chain together models so that model 1 determines a result and then feeds into another model based on that result.
For this example, the first model is the part identifier, which then feeds into a second set of models that act as the good/bad identifiers, one model per part.
This keeps your models simpler and separate.
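The conveyor example can be sketched as a small dispatch pattern. The three functions below are hypothetical stand-ins for trained models (a real part identifier would look at an image, not a dictionary field), and the names and thresholds are assumptions chosen only to show how the first result selects the second model.

```python
def identify_part(item):
    # Stand-in for model 1, the part identifier.
    return "washer" if item["has_hole"] else "bolt"

def check_bolt(item):
    # Stand-in for the bolt-specific good/bad model.
    return "good" if 49.0 <= item["length_mm"] <= 51.0 else "bad"

def check_washer(item):
    # Stand-in for the washer-specific good/bad model.
    return "good" if item["thickness_mm"] >= 1.5 else "bad"

# One good/bad model per part type, keyed by the first model's answer.
CHECKERS = {"bolt": check_bolt, "washer": check_washer}

def run_chain(item):
    """Run model 1, then feed the item to the model chosen by its result."""
    part = identify_part(item)
    verdict = CHECKERS[part](item)
    return part, verdict

conveyor = [
    {"has_hole": False, "length_mm": 50.2},
    {"has_hole": True, "thickness_mm": 1.1},
    {"has_hole": False, "length_mm": 47.0},
]
results = [run_chain(item) for item in conveyor]
print(results)
```

Adding a new part type then means training one more small model and adding one entry to the lookup, rather than retraining a single large model.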
How Capice works
Here is what a typical interaction looks like. It's a simple pattern you will get used to in minutes.
1 - Upload
To make data input easy, we provide you with templates. If you will be training linear regression or text models, then use this link to download the simple CSV template file (open it with Excel or Mac Numbers). For audio and image models, you do not have to do this. In that case, just put your training data in directories where each directory name is a classification name, then create a zip file of those directories.
This is not always required, so it may be an infrequent operation for you
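For audio and image models, the directory-per-classification layout described above could be packaged with a short script like the following sketch. The folder names (ripe, not_ripe), the placeholder files, and the helper name zip_training_dirs are assumptions for illustration; only the idea that each directory name is a classification name comes from the text.

```python
import tempfile
import zipfile
from pathlib import Path

def zip_training_dirs(root: Path, archive: Path) -> list[str]:
    """Zip every file under root, keeping the classification directory in each entry name."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(root).as_posix())
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "training"
    # Hypothetical layout: one directory per classification name.
    samples = [("ripe", "a.jpg"), ("ripe", "b.jpg"), ("not_ripe", "c.jpg")]
    for label, name in samples:
        (root / label).mkdir(parents=True, exist_ok=True)
        (root / label / name).write_bytes(b"placeholder image bytes")
    # The archive is written outside root so it does not end up inside itself.
    names = zip_training_dirs(root, Path(tmp) / "training.zip")

print(names)
```

Each entry in the archive starts with its classification directory, so the label travels with the file.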
2 - Create your model
This is the meat and potatoes of machine learning, and it is as easy as clicking the Train button. It can take several minutes to create a model, because a lot of statistics and math are executed to generate it. We will let you know on the Manage My Models tab when your model is finished.
This is a more frequent operation, as you identify and solve new business problems.
3 - Try it out!
The cool part. Now the machine learning model is ready to use through the Capice web application, the Capice iPhone application, or the Capice APIs from one of your own applications. You can also make your models public for others to use. That will share the model, but the data that created the model will never be shared with others.
This will be the most frequent operation of all.
Ready to get started?