DayF core
Decision at your Fingertips
An e2its product
Key Factors
• Easy to use: visual design and online help.
• Reliable: delivers the best results for your objective.
• Low learning curve: describe what you want to do and where the objective data is, and the platform selects the best model and algorithm for you.
• Trust: audit your experiments on the same platform you work on.
• Agility: when you need it, where you need it.
• Pricing: comparable to products with similar features.
What is the main problem?
• Data and modern techniques such as Advanced Analytics are the basis for survival in the new scenario ...
• but domain experts are not able to exploit that huge flow of information ...
• and neither experts nor developers understand these models, so they undervalue these techniques and what they can add to the business ...
• and nowadays business units find it difficult to trace improvements ...
• and it is actually an expensive investment.
What are the needs?
• Forget the complexity of the new technologies' lifecycle.
• Adapt the language to well-known concepts.
• Reports and information you can trust, delivered with the results.
• Integration capabilities with current operating platforms, tools and policies.
• Pricing that accommodates different business needs.
Automate your experience ...
Savings on the Advanced Analytics project lifecycle
• Domain analysis
• Data source discovery
• Dimension reduction
• Predictive capabilities
• Looking for compatible models
• Normalization
• Proofing
• Generalizing
• Pre-production stage
• Production stage
How does DayF core work?
(Diagram: the modelling flow runs Datasheet, Automatic selection, Execution, Automatic tuning, Leaderboard; analysing and tracing cover the experiments; consumption turns a datasheet into predictions and improvements. Analysis runs on-demand or on-premise.)
DayF Key Features
• You only need a datasheet and an objective column ... and your domain expert knowledge.
• No previous Machine Learning knowledge needed.
• 5 working modes and 4 performance metrics to choose from, helping to avoid overfitting.
• Configuration is 100% parameter-based.
• Automatic algorithm selection:
• Decision Trees
• Probabilistic
• Linear
• Anomalies (supervised / unsupervised)
• Clustering (K-Means)
• Automatic data normalization:
• Missing data
• Analysis improvement
• Automatic full information storage on 3 different engines:
• MongoDB
• HDFS
• Local filesystem
• Technically independent of the underlying Machine Learning framework, allowing future changes.
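The "automatic algorithm selection" and leaderboard idea above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the DayF implementation: the candidate "models" (a mean baseline and a least-squares line) and the RMSE scoring are hypothetical stand-ins for the real algorithm families.

```python
import math

def rmse(y_true, y_pred):
    # Root mean squared error, one of the performance metrics mentioned above.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mean_model(x_train, y_train):
    # Baseline candidate: always predict the training mean.
    mean = sum(y_train) / len(y_train)
    return lambda x: mean

def linear_model(x_train, y_train):
    # Candidate: simple least-squares fit y = a*x + b.
    n = len(x_train)
    mx = sum(x_train) / n
    my = sum(y_train) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(x_train, y_train))
    var = sum((x - mx) ** 2 for x in x_train)
    a = cov / var if var else 0.0
    b = my - a * mx
    return lambda x: a * x + b

def auto_select(candidates, x_train, y_train, x_valid, y_valid):
    # Train every candidate, score it on held-out data, and return a
    # leaderboard sorted by RMSE (best first): the "automatic selection" idea.
    board = []
    for name, factory in candidates.items():
        model = factory(x_train, y_train)
        score = rmse(y_valid, [model(x) for x in x_valid])
        board.append((score, name, model))
    board.sort(key=lambda entry: entry[0])
    return board

x = list(range(10))
y = [2 * v + 1 for v in x]
leaderboard = auto_select({'mean': mean_model, 'linear': linear_model},
                          x[:7], y[:7], x[7:], y[7:])
print([name for _, name, _ in leaderboard])  # → ['linear', 'mean']
```

Holding out validation data and ranking by a single metric is also how the deck's "avoid overfitting" claim works: the winner is chosen on data the candidates never trained on.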
Technologies
Base technology: Python 3.6 and pandas
Integrated ML frameworks: H2O.ai and Apache Spark
H2O.ai algorithms: GradientBoosting, RandomForest, NaiveBayes, GLM, K-Means, Autoencoders, DeepLearning (ANN)
Apache Spark 2.2 algorithms (ml library): GradientBoosting, RandomForest, NaiveBayes, GLM, K-Means, BisectingKMeans, DecisionTree, LinearRegression, LinearSVC, LogisticRegression
How does an analysis work?
(Architecture diagram, analysis stage: numbered requests flow from the WEB UI* through the Controller (API / REST*) and Workflow (API / REST*) layers into the Input Handler, Normalizer and Adviser, which dispatch to the H2O.ai, Apache Spark and Tensorflow* engines; Storage, Logging, Metadata and Config support every step. * = development stage.)
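The staged flow in the diagram (Input Handler, then Normalizer, then Adviser, then an engine, with logging on each hop) can be sketched as chained components. The component names come from the diagram; every behaviour inside them here is invented for illustration and is not the real gDayF code.

```python
class InputHandler:
    def run(self, rows):
        # Accept the raw datasheet as a list of dicts (stand-in for real parsing).
        return [dict(r) for r in rows]

class Normalizer:
    def run(self, rows):
        # Fill missing numeric values with the column mean ("missing data" feature).
        cols = {k for r in rows for k in r}
        for col in cols:
            vals = [r[col] for r in rows if r.get(col) is not None]
            mean = sum(vals) / len(vals)
            for r in rows:
                if r.get(col) is None:
                    r[col] = mean
        return rows

class Adviser:
    def run(self, rows):
        # Decide which engine / algorithm family to try (trivially fixed here).
        return {'engine': 'h2o', 'algorithm': 'glm', 'data': rows}

class Engine:
    def run(self, plan):
        # Stand-in for H2O.ai / Apache Spark execution.
        return {'engine': plan['engine'], 'trained_on': len(plan['data'])}

def analyse(rows):
    # Pipe the data through each stage, recording the hop (the Logging component).
    log = []
    result = rows
    for stage in (InputHandler(), Normalizer(), Adviser(), Engine()):
        result = stage.run(result)
        log.append(type(stage).__name__)
    return result, log

result, log = analyse([{'a': 1.0}, {'a': None}, {'a': 3.0}])
print(result, log)
```

The point of the staged design is that each box is replaceable: swapping the Engine (for example H2O.ai for Apache Spark) does not touch the Input Handler or Normalizer, which matches the "technically independent" claim earlier in the deck.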
How does a prediction work?
(Architecture diagram, prediction stage: the same components as the analysis stage (WEB UI*, Controller (API / REST*), Workflow (API / REST*), Input Handler, Normalizer, Adviser, the H2O.ai, Apache Spark and Tensorflow* engines, plus Storage, Logging, Metadata and Config), executed along the numbered prediction path. * = development stage.)
API: gDayF core
controller.exec_analysis takes:
• datapath / dataframe
• objective column
• execution mode
• performance metric (optional)
• analysis depth (optional)

status, recomendations = controller.exec_analysis(datapath=''.join(source_data),
                                                  objective_column='Weather_Temperature',
                                                  amode=FAST, metric='rmse', deep_impact=5)
controller.log_model_list takes:
• analysis identifier
• list of model descriptors
• sorting metric (optional)

controller.log_model_list(recomendations[0]['model_id'], recomendations, metric='combined', accuracy=True)

Multi-model analysis
Print leaderboard
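What log_model_list does conceptually, printing model descriptors ordered by a metric, can be illustrated with plain dictionaries. The descriptor fields below (`model_id`, `metrics`) are hypothetical stand-ins, not the real gDayF descriptor schema.

```python
def sort_leaderboard(models, metric, ascending=True):
    # Order model descriptors by a performance metric. For error metrics
    # such as 'rmse' lower is better (ascending=True); for accuracy-like
    # metrics higher is better (ascending=False).
    return sorted(models, key=lambda m: m['metrics'][metric], reverse=not ascending)

models = [
    {'model_id': 'gbm_1', 'metrics': {'rmse': 0.42}},
    {'model_id': 'glm_1', 'metrics': {'rmse': 0.55}},
    {'model_id': 'drf_1', 'metrics': {'rmse': 0.38}},
]
board = sort_leaderboard(models, 'rmse')
print([m['model_id'] for m in board])  # → ['drf_1', 'gbm_1', 'glm_1']
```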
controller.save_models takes:
• list of model descriptors
• saving option: [BEST, BEST_3, ALL, EACH_BEST]

controller.save_models(recomendations, mode=EACH_BEST)

controller.reconstruct_execution_tree takes:
• list of model descriptors
• sorting metric
• store: True/False
• user identifier
• experiment descriptor

execution_tree = controller.reconstruct_execution_tree(arlist=None, metric='rmse', store=False,
                                                       user=controller.user_id,
                                                       experiment=recomendations[0]['model_id'])

API: gDayF core
Saving models
Execution tree
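The saving options [BEST, BEST_3, ALL, EACH_BEST] amount to different selections over the ranked model list. A hedged sketch of that selection logic follows; the descriptor fields (`algorithm`, `rmse`) are assumptions for illustration, not the real schema, and lower metric values are assumed better.

```python
BEST, BEST_3, ALL, EACH_BEST = 'BEST', 'BEST_3', 'ALL', 'EACH_BEST'

def select_models(models, mode, metric='rmse'):
    # Pick which model descriptors to persist, mirroring the saving options.
    ranked = sorted(models, key=lambda m: m[metric])  # best (lowest error) first
    if mode == BEST:
        return ranked[:1]
    if mode == BEST_3:
        return ranked[:3]
    if mode == EACH_BEST:
        # Keep the best model of each algorithm family.
        best = {}
        for m in ranked:
            best.setdefault(m['algorithm'], m)
        return list(best.values())
    return list(models)  # ALL

models = [
    {'algorithm': 'gbm', 'rmse': 0.42},
    {'algorithm': 'gbm', 'rmse': 0.40},
    {'algorithm': 'glm', 'rmse': 0.55},
]
print([m['rmse'] for m in select_models(models, EACH_BEST)])  # → [0.4, 0.55]
```

Since remove_models later in the deck takes the same [BEST, BEST_3, ALL, EACH_BEST] modes, the identical selection would decide which models to drop instead of which to keep.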
controller.exec_sanalysis takes:
• datapath / dataframe
• list of model descriptors
• performance metric (optional)
• analysis depth (optional)

status, recomendations2 = controller.exec_sanalysis(datapath=''.join(source_data),
                                                    list_ar_metadata=recomendations[-3:-2],
                                                    metric='rmse', deep_impact=3)

controller.remove_models takes:
• list of model descriptors
• dropping mode: [BEST, BEST_3, ALL, EACH_BEST]

controller.remove_models(recomendations, mode=ALL)

API: gDayF core
Self-service analysis
Remove models
controller.exec_prediction takes:
• datapath / dataframe
• execution model descriptor, or filepath to the execution model descriptor

prediction_frame = controller.exec_prediction(datapath=''.join(source_data),
                                              model_file=recomendations[0]['json_path'][0]['value'])

controller.get_java_model takes:
• execution model descriptor
• type: [pojo, mojo]

# Save POJO
result = controller.get_java_model(recomendations[0], 'pojo')
print(result)
# Save MOJO
result = controller.get_java_model(recomendations[0], 'mojo')
print(result)

API: gDayF core
Making predictions
Java standalone component
API: gDayF core
Leaderboard table
Workflow execution

workflow.workflow takes:
• datapath / dataframe
• workflow.json

controller.table_model_list takes:
• analysis identifier
• list of model descriptors
• sorting metric (optional)

Workflow.json
Config.json
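workflow.workflow drives an analysis from a declarative JSON file. How such a file might map onto controller calls can be sketched as follows; the step names, `steps` key and dispatch table here are assumptions for illustration, not the real workflow.json schema.

```python
import json

def run_workflow(workflow_json, handlers):
    # Execute a declarative workflow: each step names an action and its
    # parameters, and the dispatch table maps actions to callables
    # (which in gDayF would be controller methods).
    steps = json.loads(workflow_json)['steps']
    results = []
    for step in steps:
        handler = handlers[step['action']]
        results.append(handler(**step.get('params', {})))
    return results

workflow_json = json.dumps({
    'steps': [
        {'action': 'analysis', 'params': {'metric': 'rmse'}},
        {'action': 'prediction'},
    ]
})

handlers = {
    'analysis': lambda metric: 'analysis(%s)' % metric,
    'prediction': lambda: 'prediction()',
}
print(run_workflow(workflow_json, handlers))  # → ['analysis(rmse)', 'prediction()']
```

Keeping the experiment definition in JSON (as Workflow.json and Config.json suggest) means a run can be repeated, audited and versioned without touching code, which fits the deck's "Trust" key factor.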
Entrepreneur Profile
Jose Luis Sánchez del Coso
Master in Computer Science, University of Granada
20+ years working in IT
10+ years working in Operations Management
8+ years working in Senior Consultancy
Scrum SCF by SCRUMStudy
ITIL® Expert by EXIN
PMP® by PMI
Oracle® Database Professional by Oracle
Managing solutions for big companies
DayF, an e2its product
Welcome to DayF's world
Thank you!
