Part 1 of a conference workshop. This forms the morning session, which looks at moving from Business Intelligence to Analytics.
Topics Covered: Azure Data Explorer, Azure Data Factory, Azure Synapse Analytics, Event Hubs, HDInsight, Big Data
Part 3 - Modern Data Warehouse with Azure Synapse - Nilesh Gule
Slide deck from the third part of building a Modern Data Warehouse using Azure. This session covered Azure Synapse, formerly SQL Data Warehouse. We look at the Azure Synapse architecture, external files, and integration with Azure Data Factory.
The recording of the session is available on YouTube
https://www.youtube.com/watch?v=LZlu6_rFzm8&WT.mc_id=DP-MVP-5003170
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ignite 2020) - Michael Rys
Presentation by James Baker and myself on Running cost effective big data workloads with Azure Synapse and Azure Datalake Storage (ADLS) at Microsoft Ignite 2020. Covers Modern Data warehouse architecture supported by Azure Synapse, integration benefits with ADLS and some features that reduce cost such as Query Acceleration, integration of Spark and SQL processing with integrated meta data and .NET For Apache Spark support.
This document provides an overview of Azure Synapse Analytics, a limitless analytics service that brings together data warehousing and big data analytics capabilities. It discusses how businesses currently have to maintain separate systems for operational data/relational data and big data/semi-structured data, which Azure Synapse addresses by providing a single service for end-to-end analytics using technologies like SQL and Spark across data warehouses and data lakes at cloud scale with unmatched speed.
Azure Databricks - Apache Spark as a Service with Sascha Dittmann - Databricks
The driving force behind Apache Spark (Databricks Inc.) and Microsoft have designed a joint service to quickly and easily create Big Data and Advanced Analytics solutions. The combination of the comprehensive Databricks Unified Analytics Platform and the powerful capabilities of Microsoft Azure makes it easy to analyse data streams or large amounts of data, as well as to train AI models. Sascha Dittmann shows in this session how the new Azure service can be set up and used in various real-world scenarios. He also shows how to connect the various Azure services to the Azure Databricks service.
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service, that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs. This is a huge deck with lots of screenshots so you can see exactly how it works.
Build Real-Time Applications with Databricks Streaming - Databricks
This document discusses using Databricks, Spark, and Power BI for real-time data streaming. It describes a use case of a fire department needing real-time reporting of equipment locations, personnel statuses, and active incidents. The solution involves ingesting event data using Azure Event Hubs, processing the stream using Databricks and Spark Structured Streaming, storing the results in Delta Lake, and visualizing the data in Power BI dashboards. It then demonstrates the architecture by walking through creating Delta tables, streaming from Event Hubs to Delta Lake, and running a sample event simulator.
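The pipeline described above (Event Hubs into Spark Structured Streaming, then into Delta Lake) can be sketched in PySpark. This is a minimal, illustrative sketch, not the session's actual code: the payload fields (`unitId`, `location`), the connection string handling, and the Delta path are assumptions, and it presumes the azure-eventhubs-spark connector and Delta Lake are installed on the cluster.

```python
import json

def parse_incident(body: bytes) -> dict:
    """Flatten one Event Hubs message body (JSON) into a row for the Delta table.
    Field names here are invented for illustration."""
    event = json.loads(body)
    return {
        "unit_id": event["unitId"],
        "status": event.get("status", "unknown"),
        "lat": float(event["location"]["lat"]),
        "lon": float(event["location"]["lon"]),
    }

def start_stream(spark, eh_connection_string: str, delta_path: str):
    """Wire Event Hubs -> Structured Streaming -> Delta Lake (sketch only)."""
    # Spark imports and JVM access live here so parse_incident above stays
    # usable and testable without a cluster.
    conf = {
        # The azure-eventhubs-spark connector expects the connection string
        # to be encrypted with its helper.
        "eventhubs.connectionString": (
            spark.sparkContext._jvm.org.apache.spark.eventhubs.EventHubsUtils
                .encrypt(eh_connection_string)
        ),
    }
    raw = spark.readStream.format("eventhubs").options(**conf).load()
    # Event Hubs delivers each payload in a binary 'body' column.
    events = raw.selectExpr("CAST(body AS STRING) AS body")
    return (events.writeStream
                  .format("delta")
                  .option("checkpointLocation", delta_path + "/_checkpoint")
                  .start(delta_path))
```

A Power BI dashboard would then read the Delta table, as in the session's demo.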
Analytics in a Day Ft. Synapse Virtual Workshop - CCG
Say goodbye to data silos! Analytics in a Day will simplify and accelerate your journey towards the modern data warehouse. Join CCG and Microsoft for a half-day virtual workshop, hosted by James McAuliffe.
The document discusses how companies can use big data analytics and Azure Databricks to improve their customer experiences and grow their business. It provides an overview of how Wide World Importers seeks to expand its customers through an omni-channel strategy using analytics from data across its retail stores, website, and mobile apps. The document also outlines logical architectures for ingesting, storing, preparing, training models on, and serving data using Azure Databricks and other Azure services.
Modern DW Architecture
- The document discusses modern data warehouse architectures using Azure cloud services like Azure Data Lake, Azure Databricks, and Azure Synapse. It covers storage options like ADLS Gen 1 and Gen 2 and data processing tools like Databricks and Synapse. It highlights how to optimize architectures for cost and performance using features like auto-scaling, shutdown, and lifecycle management policies. Finally, it provides a demo of a sample end-to-end data pipeline.
AI & Data Analytics 2018 - Azure Databricks for data scientists - Alberto Diaz Martin
This document summarizes a presentation given by Alberto Diaz Martin on Azure Databricks for data scientists. The presentation covered how Databricks can be used for infrastructure management, data exploration and visualization at scale, reducing time to value through model iterations and integrating various ML tools. It also discussed challenges for data scientists and how Databricks addresses them through features like notebooks, frameworks, and optimized infrastructure for deep learning. Demo sections showed EDA, ML pipelines, model export, and deep learning modeling capabilities in Databricks.
This document discusses designing a modern data warehouse in Azure. It provides an overview of traditional vs. self-service data warehouses and their limitations. It also outlines challenges with current data warehouses around timeliness, flexibility, quality and findability. The document then discusses why organizations need a modern data warehouse based on criteria like customer experience, quality assurance and operational efficiency. It covers various approaches to ingesting, storing, preparing, modeling and serving data on Azure. Finally, it discusses architectures like the lambda architecture and common data models.
Develop scalable analytical solutions with Azure Data Factory & Azure SQL Data Warehouse - Microsoft Tech Community
In this session you will learn how to develop data pipelines in Azure Data Factory and build a Cloud-based analytical solution adopting modern data warehouse approaches with Azure SQL Data Warehouse and implementing incremental ETL orchestration at scale. With the multiple sources and types of data available in an enterprise today Azure Data factory enables full integration of data and enables direct storage in Azure SQL Data Warehouse for powerful and high-performance query workloads which drive a majority of enterprise applications and business intelligence applications.
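The incremental-ETL orchestration mentioned above usually follows a high-watermark pattern: each run loads only rows modified since the last recorded watermark, then advances the watermark. Here is a plain-Python sketch of that idea under stated assumptions (the `modified_at` column and the in-memory source are invented; in Azure Data Factory the watermark would live in a control table and the filter would be pushed into the source query).

```python
from datetime import datetime

def incremental_load(source_rows, watermark: datetime):
    """One pipeline run: pick up rows modified after the watermark and
    return them together with the advanced watermark for the next run."""
    new_rows = [r for r in source_rows if r["modified_at"] > watermark]
    next_watermark = max((r["modified_at"] for r in new_rows), default=watermark)
    return new_rows, next_watermark

# Illustrative source table with a 'modified_at' change-tracking column.
source = [
    {"id": 1, "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "modified_at": datetime(2024, 1, 3)},
]
# Only id 2 is newer than the stored watermark, so only it is loaded.
batch, watermark = incremental_load(source, datetime(2024, 1, 2))
```

Running the same function again with the advanced watermark loads nothing, which is what makes repeated scheduled runs idempotent.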
Apache Spark is a fast and general engine for large-scale data processing. It was created by UC Berkeley and is now the dominant framework in big data. Spark can run programs over 100x faster than Hadoop in memory, or more than 10x faster on disk. It supports Scala, Java, Python, and R. Databricks provides a Spark platform on Azure that is optimized for performance and integrates tightly with other Azure services. Key benefits of Databricks on Azure include security, ease of use, data access, high performance, and the ability to solve complex analytics problems.
This document discusses using Azure Data Factory (ADF) for data lake ETL processes in the cloud. It describes how ADF can ingest data from on-premises, cloud, and SaaS sources into a data lake for preparation, transformation, enrichment, and serving to downstream analytics or machine learning processes. The document also provides several links to YouTube videos and articles about using ADF for these tasks.
This document discusses various integration patterns and architectures that involve Microsoft Azure and BizTalk Server. It presents questions that customers may ask about integration solutions. It also provides examples of hybrid integration architectures that leverage Azure services like Service Bus along with on-premises BizTalk Server. The document aims to help customers analyze requirements and evaluate different architectural options for their integration needs.
Data Saturday Malta - ADX Azure Data Explorer overview - Riccardo Zamana
This is a step-by-step walk through the entire ecosystem of features driven by Azure Data eXplorer. You can find many examples using the Kusto dialect for acquiring data, processing it, and building up complete web interfaces using only one service: ADX.
Slidedeck related to the talk presented at the Manila Data Day event March 2020. The demo covers Azure services like Data Lake Storage (Gen 2), Azure Data Factory, Azure Databricks, Azure Synapse, Key Vault and Active directory to build a modern data warehouse.
Using Redash for SQL Analytics on Databricks - Databricks
This talk gives a brief overview with a demo performing SQL analytics with Redash and Databricks. We will introduce some of the new features coming as part of our integration with Databricks following the acquisition earlier this year, along with a demo of the other Redash features that enable a productive SQL experience on top of Delta Lake.
The breadth and depth of Azure products that fall under the AI and ML umbrella can be difficult to follow. In this presentation I’ll first define exactly what AI, ML, and deep learning are, and then go over the various Microsoft AI and ML products and their use cases.
Microsoft Fabric is the next version of Azure Data Factory, Azure Data Explorer, Azure Synapse Analytics, and Power BI. It brings all of these capabilities together into a single unified analytics platform that goes from the data lake to the business user in a SaaS-like environment. Therefore, the vision of Fabric is to be a one-stop shop for all the analytical needs of every enterprise, and one platform for everyone from a citizen developer to a data engineer. Fabric will cover the complete spectrum of services including data movement, data lake, data engineering, data integration and data science, observational analytics, and business intelligence. With Fabric, there is no need to stitch together different services from multiple vendors. Instead, the customer enjoys an end-to-end, highly integrated single offering that is easy to understand, onboard, create and operate.
This is a hugely important new product from Microsoft and I will simplify your understanding of it via a presentation and demo.
Agenda:
What is Microsoft Fabric?
Workspaces and capacities
OneLake
Lakehouse
Data Warehouse
ADF
Power BI / DirectLake
Resources
So you've got a handle on what Big Data is and how you can use it to find business value in your data. Now you need an understanding of the Microsoft products that can be used to create a Big Data solution. Microsoft has many pieces of the puzzle, and in this presentation I will show how they fit together. How does Microsoft enhance and add value to Big Data? From collecting data, transforming it, and storing it, to visualizing it, I will show you Microsoft’s solutions for every step of the way.
Big Data Expo 2015 - Microsoft: Transform your data into intelligent action - BigDataExpo
There are many promises around Big Data. Everyone talks about it, but how do you get started without immediately having to draw up a big business case? The Cortana Analytics Suite is an approachable, easily accessible Advanced Analytics platform for testing the feasibility of your ideas and then growing into (large) production implementations. In this session you will get an overview of the scenarios that Cortana Analytics offers: think IoT and Machine Learning, but also Churn Analysis, Forecasting and Predictive Maintenance.
How the Azure ecosystem plays a crucial role in your IoT solution (Glenn C...) - Codit
The document discusses how the Azure ecosystem plays a crucial role in IoT solutions. It outlines key Azure services for connecting devices, processing streaming data, implementing business logic, enabling connectivity, and providing insights. These services include IoT Hub for device connectivity, Stream Analytics for real-time analytics, Service Fabric for business logic, Logic Apps for connectivity, and Time Series Insights for streaming insights. The document also presents the Azure IoT reference architecture and recommends starting with preconfigured solutions like IoT Central to get up and running quickly.
IoT - Lessons learned from customer projects in the IoT domain. Michael Epprecht, Technical Specialist in the Global Black Belt IoT Team at Microsoft. Talk given at the Swiss Data Forum, 24 November 2015, in Lausanne.
This document discusses the future of data and the Azure data ecosystem. It highlights that by 2025 there will be 175 zettabytes of data in the world and the average person will have over 5,000 digital interactions per day. It promotes Azure services like Power BI, Azure Synapse Analytics, Azure Data Factory and Azure Machine Learning for extracting value from data through analytics, visualization and machine learning. The document provides overviews of key Azure data and analytics services and how they fit together in an end-to-end data platform for business intelligence, artificial intelligence and continuous intelligence applications.
This document provides an overview of Azure Synapse Analytics and its key capabilities. Azure Synapse Analytics is a limitless analytics service that brings together enterprise data warehousing and big data analytics. It allows querying data on-demand or at scale using serverless or provisioned resources. The document outlines Synapse's integrated data platform capabilities for business intelligence, artificial intelligence and continuous intelligence. It also describes the different types of analytics workloads that Synapse supports and key architectural components like the dedicated SQL pool and massively parallel processing concepts.
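The massively parallel processing idea mentioned above can be made concrete with a toy model: a hash-distributed table in a dedicated SQL pool assigns each row to one of 60 fixed distributions by hashing the value in the chosen distribution column. The sketch below is purely conceptual (Python's `hash()` stands in for Synapse's internal hash function, and the table and column names are invented).

```python
NUM_DISTRIBUTIONS = 60  # fixed distribution count in a dedicated SQL pool

def distribution_for(distribution_key) -> int:
    """Map a distribution-column value to a distribution (illustrative only;
    not Synapse's actual hash function)."""
    return hash(distribution_key) % NUM_DISTRIBUTIONS

# Rows that share a key always land in the same distribution, so joins and
# aggregations on that key avoid data movement between distributions.
orders = [("cust_1", 9.99), ("cust_2", 4.50), ("cust_1", 12.00)]
placement = {}
for customer, amount in orders:
    placement.setdefault(distribution_for(customer), []).append((customer, amount))
```

This is why the choice of distribution column matters: a skewed column concentrates rows (and work) on a few distributions, while a high-cardinality join key spreads them evenly.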
Instructor: Ivan Cheng, Solution Architect, AWS
Join us for a series of introductory and technical sessions on AWS Big Data solutions. Gain a thorough understanding of what Amazon Web Services offers across the big data lifecycle and learn architectural best practices for applying those solutions to your projects.
We will kick off this technical seminar in the morning with an introduction to the AWS Big Data platform, including a discussion of popular use cases and reference architectures. In the afternoon, we will deep dive into Machine Learning and Streaming Analytics. We will then walk everyone through building your first Big Data application with AWS.
The document discusses building an end-to-end analytic solution in the cloud using Microsoft Azure tools, including ingesting data from various sources into Azure Data Factory, storing it in Azure Data Lake, transforming the data using U-SQL scripts in Azure Data Lake Analytics, developing predictive models with Azure Machine Learning Studio, and visualizing insights with Power BI. It provides examples of how each tool in the analytic lifecycle can be leveraged as part of an overall cloud-based analytics solution handling large volumes of data.
In this session we will delve into the world of Azure Databricks and analyse why it is becoming a fundamental tool for data scientists and/or data engineers in conjunction with Azure services.
Azure Machine Learning Services provides an end-to-end, scalable platform for operationalizing machine learning models. It allows users to deploy models everywhere, from containers and Kubernetes to SQL Data Warehouse and Cosmos DB. It also offers tools to boost data science productivity, increase experimentation, and automate model retraining. The platform seamlessly integrates with Azure services and is built to deploy models globally at scale with high availability and low latency.
MongoDB IoT City Tour STUTTGART: The Microsoft Azure Platform for IoT - MongoDB
Presented by, Dr Christian Geuer-Pollmann, Senior Technology Evangelist at Microsoft.
The presentation gives a solid overview of the Microsoft Azure platform, with a special emphasis on scenarios for IoT workloads. First, Christian provides an introduction to Microsoft Azure’s IaaS compute and networking infrastructure (i.e. virtual machines, virtual networks, load balancers and HA concepts). The second part of the presentation focuses on higher-order services in Azure, such as relational databases, machine learning, search, and NoSQL offerings. Last, Christian explains how the Azure Service Bus and the Intelligent Systems Services fit into the overall IoT landscape.
Azure provides cloud computing services including computing, analytics, networking, storage, and more. It offers virtual machines, databases, websites, and other services that can be accessed from anywhere and scaled up as needed. Azure aims to provide enterprise-grade services that are economical, scalable, and hybrid-ready to work with existing on-premises systems. It has data centers across the world and over 600,000 servers to provide its services globally at scale.
Comparing Microsoft Big Data Platform Technologies - Jen Stirrup
In this segment, we look at technologies such as HDInsight, Azure Databricks, Azure Data Lake Analytics and Apache Spark. We compare the technologies to help you to decide the best technology for your situation.
Estimating the Total Costs of Your Cloud Analytics Platform - DATAVERSITY
Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize applications and utilize machine learning. They need a platform designed to address multi-faceted needs by offering multi-function Data Management and analytics to solve the enterprise’s most pressing data and analytic challenges in a streamlined fashion. They need a worry-free experience with the architecture and its components.
Azure Data Explorer deep dive - review 04.2020 - Riccardo Zamana
Modern Data Science Lifecycle with ADX & Azure
This document discusses using Azure Data Explorer (ADX) for data science workflows. ADX is a fully managed analytics service for real-time analysis of streaming data. It allows for ad-hoc querying of data using Kusto Query Language (KQL) and integrates with various Azure data ingestion sources. The document provides an overview of the ADX architecture and compares it to other time series databases. It also covers best practices for ingesting data, visualizing results, and automating workflows using tools like Azure Data Factory.
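To give a flavour of the ad-hoc KQL querying described above, here is a small example query plus the usual Python wiring via the `azure-kusto-data` SDK. The cluster URL, database, and `Telemetry` table are invented for illustration, and the SDK import is kept inside the function so the query text stands on its own.

```python
# Hypothetical KQL: count events per hour over the last day from a
# 'Telemetry' table (table and column names are invented).
KQL_EVENTS_PER_HOUR = """
Telemetry
| where Timestamp > ago(1d)
| summarize Events = count() by bin(Timestamp, 1h)
| order by Timestamp asc
"""

def run_query(cluster_url: str, database: str, query: str):
    """Execute a KQL query against an ADX cluster.
    Requires `pip install azure-kusto-data` and an authenticated Azure CLI."""
    from azure.kusto.data import KustoClient, KustoConnectionStringBuilder
    kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster_url)
    client = KustoClient(kcsb)
    response = client.execute(database, query)
    # The first primary result set holds the query's rows.
    return response.primary_results[0]
```

The same query text could equally be run from the ADX web UI or scheduled from Azure Data Factory, per the automation section of the deck.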
Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. It provides the freedom to query data at scale using either serverless or dedicated options. Azure HDInsight allows the use of open source frameworks like Hadoop, Spark, Hive, and Kafka for processing large volumes of data. Azure Databricks offers environments for SQL, data science/engineering, and machine learning. The Azure IoT Hub enables scalable IoT solutions by allowing bidirectional communication between IoT applications and connected devices.
The Metaverse and AI: how can decision-makers harness the Metaverse for their... - Jen Stirrup
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our data isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives?
How can you help your company evolve, adapt, and succeed using Artificial Intelligence and the Metaverse to stay ahead of the competition? What are the potential issues, complications, and benefits that these technologies could bring to us and our organizations? In this session, Jen Stirrup will explain how to start thinking about these technologies as an organisation.
AI Applications in Healthcare and Medicine - Jen Stirrup
This session was delivered for the Global Business Roundtable. The topic: AI applications in Healthcare and Medicine. In this session, Jennifer Stirrup takes people through a general process of adopting AI in their organisations.
BUILDING A STRONG FOUNDATION FOR SUCCESS WITH BI AND DIGITAL TRANSFORMATION - Jen Stirrup
The objective of Digital Transformation is to improve the quality and resilience of digital services to serve customers better, and data is a crucial part of fulfilling that ambition. As the organisation moves forward in pursuit of its strategic ambitions, it will need to remain focused on the stabilisation and improvement of existing technology and data foundations. To succeed, organisations need to continuously strive to improve data, systems and processes for people using digital solutions; it is not simply digitising paper processes. The challenge of digital transformation is to work with people, but how can you build systems that serve them well so they achieve and deliver more in a customer-focused way? Innovators will relish the opportunity to adopt new technology, but laggards are often waiting for proof that it will help them deliver better services or products. The challenge is that the adoption of digital solutions varies significantly from one person to the next, one team to the next and one organisation to the next. In this keynote, there will be a discussion of the industry landscape followed by takeaways that will help digital transformation in your organization.
1. Do more than get the basics right
2. Build confidence in changes through better use of data
3. How to oversee delivery while considering strategy
CuRious about R in Power BI? End to end R in Power BI for beginners - Jen Stirrup
R is a widely used open-source statistical software environment used by over 2 million data scientists and analysts. It is based on the S programming language and is developed by the R Foundation. R provides a flexible and powerful environment for statistical analysis, modeling, and data visualization. Some key advantages include being free, having an extensive community for support, and allowing for automated replication through scripting. However, it also has some drawbacks like having a steep learning curve and scripts sometimes being difficult to understand.
Artificial Intelligence Ethics keynote: With Great Power, comes Great Respons... - Jen Stirrup
Artificial Intelligence has been receiving some bad press recently, with respect to its ethical consequences in terms of changes to working conditions, deepfake technology and even job losses. Organizations are concerned about bias in their data, perpetuating stereotypes and neglecting responsibility. How can AI systems treat all people fairly? What about concerns of safety and reliability?
In this keynote, we will explore the toolkits available in Azure to help businesses to navigate the complex ethics environment. Join this session to understand what Microsoft can offer in terms of supporting organisations to consider ethics as an integral part of their AI solutions.
Introduction to Analytics with Azure Notebooks and Python - Jen Stirrup
Introduction to Analytics with Azure Notebooks and Python for Data Science and Business Intelligence. This is one part of a full day workshop on moving from BI to Analytics
When looking at Sales Analytics, where should you start? What should you measure? This session provides ideas on sales metrics, implemented in Power BI
This document provides guidance on creating an effective digital marketing analytics dashboard using Power BI. It recommends connecting to Google Analytics as a primary data source and including visualizations of key performance indicators (KPIs) like impressions, clicks, and spending over time. The dashboard should allow users to interact with the data by selecting specific time periods to analyze and compare metrics. Color coding and tooltips can also help users understand relationships in the data and drill down into further details.
Diversity and inclusion for the newbies and doers - Jen Stirrup
This presentation is aimed at people who want to *do* something positive for diversity and inclusion in their workplaces and communities, but don't know where to start to have a quick impact. I've made up a checklist of 7 'E's to help people along. We cover crucial topics such as: • What can we do to tackle unconscious bias in our systems, solutions and interactions with others? • How can we be more inclusive towards others? • How can we encourage and mentor younger generations to get involved in STEM topics and technical roles both as leaders and in the communities of people who surround us? I hope you enjoy this interactive and thought-provoking discussion of diversity and inclusion, aimed at people who want to get started and do something positive and impactful to help others.
Artificial Intelligence from the Business perspective - Jen Stirrup
What is AI from the Business perspective? In this presentation, Jen Stirrup discusses the 8 'C's of Artificial Intelligence from the business leadership perspective.
How to be successful with Artificial Intelligence - from small to success - Jen Stirrup
Keynote from AI World Congress in October 2019. Artificial Intelligence isn't just for the techies; it is crucial that business-oriented individuals adopt this technology, which can be conceived as the fourth industrial age. Artificial intelligence is becoming closer to being a part of our daily lives through the use of technologies like virtual assistants such as Alexa, smart homes, and automated customer service. Now, we are running the race not just to win, but to survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and futurist ideas are developing into reality at accelerated rates.
How can you help your company to evolve, adapt and succeed using Artificial Intelligence to stay at the forefront of the competition, and win the race for AI adoption in your organization? What are the potential issues, complications and benefits that artificial intelligence could bring to us and our organisations? In this session, Jen Stirrup will explain the quick wins to win the Red Queen's Race in Artificial Intelligence.
Artificial Intelligence: Winning the Red Queen’s Race Keynote at ESPC with Jen Stirrup
Artificial Intelligence is popularised in fiction films such as “The Terminator” and “AI: Artificial Intelligence”. Now, artificial intelligence is becoming closer to being a part of our daily lives through the use of technologies like virtual assistants such as Cortana, smart homes, and automated customer service.
Now, we are running the Red Queen’s race not just to win, but to survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and futurist ideas are developing into reality at accelerated rates.
How can you help your company to evolve, adapt and succeed using Artificial Intelligence to stay at the forefront of the competition, and win the Red Queen’s Race? What are the potential issues, complications and benefits that artificial intelligence could bring to us and our organisations?
In this keynote, Jen Stirrup explains the quick wins to win the Red Queen’s Race, using demos from Microsoft technologies such as AutoML to help you and your organisation win the Red Queen’s race.
Data Visualization - a dataviz superpower! Guidelines on using best-practice data visualization principles for Power BI, Excel, SSRS, Tableau and other great tools!
R - what do the numbers mean? #RStats. This is the presentation for my demo at Orlando Live360 AILive. We go through statistics interpretation with examples.
Artificial Intelligence and Deep Learning in Azure, CNTK and Tensorflow - Jen Stirrup
Artificial Intelligence and Deep Learning in Azure, using Open Source technologies CNTK and Tensorflow. The tutorial can be found on GitHub here: https://github.com/Microsoft/CNTK/tree/master/Tutorials
and the CNTK video can be found here: https://youtu.be/qgwaP43ZIwA
Blockchain Demystified for Business Intelligence Professionals - Jen Stirrup
Blockchain is a transformational technology with the potential to extend digital transformation beyond an organization and into the processes it shares with suppliers, customers, and partners.
What is blockchain? What can it do for my organization? How can your organisation manage a blockchain implementation? How does it work in Azure?
Join this session to learn about blockchain and see it in action. We will also discuss the use cases for blockchain, and whether it is here to stay.
Examples of the worst data visualization ever - Jen Stirrup
This document summarizes an event called SQL Saturday Cork where Jen Stirrup gave a presentation on data visualizations. The document includes objectives for the presentation such as discussing inaccurate data sources and the use of dark colors to represent higher values. It also includes examples of Zimbabwean inflation rates from 1980 to 2008 shown in a table and chart to illustrate how data can be visualized.
Digital Transformation for the Human Resources Leader - Jen Stirrup
The document discusses digital transformation for HR leaders. It provides advice on how HR can effectively manage digital transformation through principles like:
1) Communicating change and ensuring employee buy-in for new technologies.
2) Carefully planning the transformation journey and getting feedback to adjust.
3) Engaging and retaining employees through the changes using data and visualization.
Selecting the right digital HR platforms requires research, putting business needs first, and testing projects before full implementation to define success. The biggest challenge is supporting employees through technological changes while improving their working lives.
Digital Pragmatism with Business Intelligence, Big Data and Data Visualisation - Jen Stirrup
Contact details:
Jen.Stirrup@datarelish.com
In a world where the HiPPO (Highest Paid Person’s Opinion) is final, how can we use technology to drive the organisation towards data-driven decision making as part of its organizational DNA? R provides a range of functionality in machine learning, but we need to expose its richness in a world where it is made accessible to decision makers. Using Data Storytelling with R, we can imprint data in the culture of the organization by making it easily accessible to everyone, including decision makers. Together, the insights and process of machine learning are combined with data visualisation to help organisations derive value and insights from big and little data.
Are you interested in dipping your toes in the cloud native observability waters, but as an engineer you are not sure where to get started with tracing problems through your microservices and application landscapes on Kubernetes? Then this is the session for you, where we take you on your first steps in an active open-source project that offers a buffet of languages, challenges, and opportunities for getting started with telemetry data.
The project is called openTelemetry, but before diving into the specifics, we’ll start with de-mystifying key concepts and terms such as observability, telemetry, instrumentation, cardinality, and percentile to lay a foundation. After understanding the nuts and bolts of observability and distributed traces, we’ll explore the openTelemetry community; its Special Interest Groups (SIGs), repositories, and how to become not only an end-user, but possibly a contributor. We will wrap up with an overview of the components in this project, such as the Collector, the OpenTelemetry protocol (OTLP), its APIs, and its SDKs.
Attendees will leave with an understanding of key observability concepts, become grounded in distributed tracing terminology, be aware of the components of openTelemetry, and know how to take their first steps to an open-source contribution!
Key Takeaways: Open source, vendor neutral instrumentation is an exciting new reality as the industry standardizes on openTelemetry for observability. OpenTelemetry is on a mission to enable effective observability by making high-quality, portable telemetry ubiquitous. The world of observability and monitoring today has a steep learning curve and in order to achieve ubiquity, the project would benefit from growing our contributor community.
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - MydbopsMydbops
This presentation, delivered at the Postgres Bangalore (PGBLR) Meetup-2 on June 29th, 2024, dives deep into connection pooling for PostgreSQL databases. Aakash M, a PostgreSQL Tech Lead at Mydbops, explores the challenges of managing numerous connections and explains how connection pooling optimizes performance and resource utilization.
Key Takeaways:
* Understand why connection pooling is essential for high-traffic applications
* Explore various connection poolers available for PostgreSQL, including pgbouncer
* Learn the configuration options and functionalities of pgbouncer
* Discover best practices for monitoring and troubleshooting connection pooling setups
* Gain insights into real-world use cases and considerations for production environments
This presentation is ideal for:
* Database administrators (DBAs)
* Developers working with PostgreSQL
* DevOps engineers
* Anyone interested in optimizing PostgreSQL performance
Contact info@mydbops.com for PostgreSQL Managed, Consulting and Remote DBA Services
Video traffic on the Internet is constantly growing; networked multimedia applications consume a predominant share of the available Internet bandwidth. A major technical breakthrough and enabler in multimedia systems research and of industrial networked multimedia services certainly was the HTTP Adaptive Streaming (HAS) technique. This resulted in the standardization of MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) which, together with HTTP Live Streaming (HLS), is widely used for multimedia delivery in today’s networks. Existing challenges in multimedia systems research deal with the trade-off between (i) the ever-increasing content complexity, (ii) various requirements with respect to time (most importantly, latency), and (iii) quality of experience (QoE). Optimizing towards one aspect usually negatively impacts at least one of the other two aspects if not both. This situation sets the stage for our research work in the ATHENA Christian Doppler (CD) Laboratory (Adaptive Streaming over HTTP and Emerging Networked Multimedia Services; https://athena.itec.aau.at/), jointly funded by public sources and industry. In this talk, we will present selected novel approaches and research results of the first year of the ATHENA CD Lab’s operation. We will highlight HAS-related research on (i) multimedia content provisioning (machine learning for video encoding); (ii) multimedia content delivery (support of edge processing and virtualized network functions for video networking); (iii) multimedia content consumption and end-to-end aspects (player-triggered segment retransmissions to improve video playout quality); and (iv) novel QoE investigations (adaptive point cloud streaming). We will also put the work into the context of international multimedia systems research.
The Rise of Supernetwork Data Intensive ComputingLarry Smarr
Invited Remote Lecture to SC21
The International Conference for High Performance Computing, Networking, Storage, and Analysis
St. Louis, Missouri
November 18, 2021
Details of description part II: Describing images in practice - Tech Forum 2024BookNet Canada
This presentation explores the practical application of image description techniques. Familiar guidelines will be demonstrated in practice, and descriptions will be developed “live”! If you have learned a lot about the theory of image description techniques but want to feel more confident putting them into practice, this is the presentation for you. There will be useful, actionable information for everyone, whether you are working with authors, colleagues, alone, or leveraging AI as a collaborator.
Link to presentation recording and transcript: https://bnctechforum.ca/sessions/details-of-description-part-ii-describing-images-in-practice/
Presented by BookNet Canada on June 25, 2024, with support from the Department of Canadian Heritage.
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Erasmo Purificato
Slide of the tutorial entitled "Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Emerging Trends" held at UMAP'24: 32nd ACM Conference on User Modeling, Adaptation and Personalization (July 1, 2024 | Cagliari, Italy)
AI_dev Europe 2024 - From OpenAI to Opensource AIRaphaël Semeteys
Navigating Between Commercial Ownership and Collaborative Openness
This presentation explores the evolution of generative AI, highlighting the trajectories of various models such as GPT-4, and examining the dynamics between commercial interests and the ethics of open collaboration. We offer an in-depth analysis of the levels of openness of different language models, assessing various components and aspects, and exploring how the (de)centralization of computing power and technology could shape the future of AI research and development. Additionally, we explore concrete examples like LLaMA and its descendants, as well as other open and collaborative projects, which illustrate the diversity and creativity in the field, while navigating the complex waters of intellectual property and licensing.
AC Atlassian Coimbatore Session Slides( 22/06/2024)apoorva2579
This is the combined Sessions of ACE Atlassian Coimbatore event happened on 22nd June 2024
The session order is as follows:
1.AI and future of help desk by Rajesh Shanmugam
2. Harnessing the power of GenAI for your business by Siddharth
3. Fallacies of GenAI by Raju Kandaswamy
Transcript: Details of description part II: Describing images in practice - T...BookNet Canada
This presentation explores the practical application of image description techniques. Familiar guidelines will be demonstrated in practice, and descriptions will be developed “live”! If you have learned a lot about the theory of image description techniques but want to feel more confident putting them into practice, this is the presentation for you. There will be useful, actionable information for everyone, whether you are working with authors, colleagues, alone, or leveraging AI as a collaborator.
Link to presentation recording and slides: https://bnctechforum.ca/sessions/details-of-description-part-ii-describing-images-in-practice/
Presented by BookNet Canada on June 25, 2024, with support from the Department of Canadian Heritage.
Performance Budgets for the Real World by Tammy EvertsScyllaDB
Performance budgets have been around for more than ten years. Over those years, we’ve learned a lot about what works, what doesn’t, and what we need to improve. In this session, Tammy revisits old assumptions about performance budgets and offers some new best practices. Topics include:
• Understanding performance budgets vs. performance goals
• Aligning budgets with user experience
• Pros and cons of Core Web Vitals
• How to stay on top of your budgets to fight regressions
How to Avoid Learning the Linux-Kernel Memory ModelScyllaDB
The Linux-kernel memory model (LKMM) is a powerful tool for developing highly concurrent Linux-kernel code, but it also has a steep learning curve. Wouldn't it be great to get most of LKMM's benefits without the learning curve?
This talk will describe how to do exactly that by using the standard Linux-kernel APIs (locking, reference counting, RCU) along with a simple rules of thumb, thus gaining most of LKMM's power with less learning. And the full LKMM is always there when you need it!
In this follow-up session on knowledge and prompt engineering, we will explore structured prompting, chain of thought prompting, iterative prompting, prompt optimization, emotional language prompts, and the inclusion of user signals and industry-specific data to enhance LLM performance.
Join EIS Founder & CEO Seth Earley and special guest Nick Usborne, Copywriter, Trainer, and Speaker, as they delve into these methodologies to improve AI-driven knowledge processes for employees and customers alike.
Blockchain and Cyber Defense Strategies in new genre timesanupriti
Explore robust defense strategies at the intersection of blockchain technology and cybersecurity. This presentation delves into proactive measures and innovative approaches to safeguarding blockchain networks against evolving cyber threats. Discover how secure blockchain implementations can enhance resilience, protect data integrity, and ensure trust in digital transactions. Gain insights into cutting-edge security protocols and best practices essential for mitigating risks in the blockchain ecosystem.
Interaction Latency: Square's User-Centric Mobile Performance MetricScyllaDB
Mobile performance metrics often take inspiration from the backend world and measure resource usage (CPU usage, memory usage, etc) and workload durations (how long a piece of code takes to run).
However, mobile apps are used by humans and the app performance directly impacts their experience, so we should primarily track user-centric mobile performance metrics. Following the lead of tech giants, the mobile industry at large is now adopting the tracking of app launch time and smoothness (jank during motion).
At Square, our customers spend most of their time in the app long after it's launched, and they don't scroll much, so app launch time and smoothness aren't critical metrics. What should we track instead?
This talk will introduce you to Interaction Latency, a user-centric mobile performance metric inspired from the Web Vital metric Interaction to Next Paint"" (web.dev/inp). We'll go over why apps need to track this, how to properly implement its tracking (it's tricky!), how to aggregate this metric and what thresholds you should target.
6. What to use and when?
● Azure Synapse Analytics: a fully managed, elastic data warehouse with security at every level of scale at no extra cost.
● Azure Databricks: a fast, easy and collaborative Apache Spark-based analytics platform.
● HDInsight: a fully managed cloud Hadoop and Spark service backed by a 99.9% SLA for your enterprise.
● Machine Learning: a fully managed cloud service that enables you to easily build, deploy and share predictive analytics solutions.
● Stream Analytics: an on-demand, real-time stream processing service with enterprise-grade security, auditing and support.
7. What to use and when?
● Data Lake Store: a no-limits data lake built to support massively parallel analytics.
● Data Lake Analytics: a fully managed, on-demand, pay-per-job analytics service with enterprise-grade security, auditing and support.
● Azure Data Catalog: an enterprise-wide metadata catalogue that makes data asset discovery simple.
● Data Factory: a data integration service to orchestrate and automate data movement and transformation.
10. Azure Data Explorer
Jupyter Notebook allows you to create and share documents that contain live code, equations, visualizations, and explanatory text.
KQL magic commands extend the functionality of the Python kernel in Jupyter Notebook. KQL magic allows you to write KQL queries natively and query data from Microsoft Azure Data Explorer. You can easily interchange between Python and KQL, and visualize data using the rich Plotly library integrated with KQL render commands. KQL magic supports Azure Data Explorer, Application Insights, and Log Analytics as data sources to run queries against.
12. Azure Data Explorer
Fast and highly scalable data exploration service. Azure Data Explorer is a fast, fully managed data analytics service for real-time analysis on large volumes of data streaming from applications, websites, IoT devices and more.
13. Azure Data Explorer
● Low-latency ingestion
● Fast read-only queries with high concurrency
● Query large amounts of structured, semi-structured (JSON-like nested types) and unstructured (free-text) data
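The kind of aggregation a KQL query runs over semi-structured records (for example, `summarize count() by` a nested field) can be sketched in plain Python. This is a toy stand-in for intuition only, with invented event data, not the ADX engine or the Kqlmagic API:

```python
from collections import Counter

# Toy stand-in for a KQL query such as:
#   Events | summarize count() by Device.Type
# Each record is a JSON-like nested structure, as ADX ingests it.
events = [
    {"device": {"type": "thermostat"}, "reading": 21.5},
    {"device": {"type": "thermostat"}, "reading": 22.0},
    {"device": {"type": "camera"},     "reading": 0.0},
]

def summarize_count_by(records, *path):
    """Count records grouped by a nested field, KQL-summarize style."""
    counts = Counter()
    for record in records:
        value = record
        for key in path:          # walk the nested JSON path
            value = value[key]
        counts[value] += 1
    return dict(counts)

print(summarize_count_by(events, "device", "type"))
# {'thermostat': 2, 'camera': 1}
```

In ADX the same grouping runs as a distributed read-only query over ingested data rather than an in-memory loop.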
18. Data Factory
● No code or maintenance required to build hybrid ETL and ELT pipelines within the Data Factory visual environment.
● Cost-efficient and fully managed serverless cloud data integration tool that scales on demand.
19. Data Factory
● Azure security measures to connect to on-premises, cloud-based and software-as-a-service apps with peace of mind.
● SSIS integration runtime to easily move SSIS ETL workloads into the cloud with minimal effort.
20. Data Factory
● Ingest, move, prepare, transform and process your data in a few clicks, and complete your data modelling within the accessible visual environment.
21. Why Data Factory?
● Orchestrate, monitor and schedule data pipelines
● Automatic cloud resource management
● A single pane of glass
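What "orchestrate" means here can be illustrated with a toy, dependency-ordered pipeline runner (a hypothetical miniature for intuition only, not the Data Factory API; activity names are invented):

```python
# Toy orchestrator: run activities in dependency order, recording status,
# roughly what a pipeline service does when it schedules activities.
def run_pipeline(activities, dependencies):
    """activities: name -> callable; dependencies: name -> upstream names."""
    done, log = set(), []

    def run(name):
        if name in done:
            return
        for upstream in dependencies.get(name, []):
            run(upstream)              # upstream activities must finish first
        activities[name]()
        done.add(name)
        log.append((name, "Succeeded"))

    for name in activities:
        run(name)
    return log

staged = []
log = run_pipeline(
    {"copy_raw":  lambda: staged.append("raw"),
     "transform": lambda: staged.append("curated")},
    {"transform": ["copy_raw"]},
)
print(log)  # [('copy_raw', 'Succeeded'), ('transform', 'Succeeded')]
```

Data Factory adds the pieces this sketch leaves out: triggers for scheduling, retries, monitoring, and managed compute behind each activity.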
25. Stream Analytics
● Build streaming pipelines in minutes - Run complex analytics with no need to learn new processing frameworks or provision virtual machines (VMs) or clusters. Use the familiar SQL language, extensible with JavaScript and C# custom code for more advanced use cases. Easily enable scenarios such as low-latency dashboarding, streaming ETL and real-time alerting with one-click integration across sources and sinks.
● Run mission-critical workloads with subsecond latencies - Get guaranteed, "exactly once" event processing with 99.9% availability and built-in recovery capabilities. Easily set up a continuous integration and continuous delivery (CI/CD) pipeline and achieve subsecond latencies on your most demanding workloads.
● Deploy in the cloud and on the edge - Bring real-time insights and analytics capabilities closer to where your data originates. Enable new scenarios with true hybrid architectures for stream processing and run the same query in the cloud or on the edge.
● Power real-time analytics with artificial intelligence - Take advantage of built-in machine learning (ML) models to shorten time to insights. Use ML-based capabilities to perform anomaly detection directly in your streaming jobs with Azure Stream Analytics.
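A core building block of those streaming queries is the tumbling window (GROUP BY TumblingWindow(second, 10) in the Stream Analytics query language). Its semantics can be sketched in plain Python with invented timestamped events:

```python
from collections import defaultdict

# Toy tumbling-window count: every event lands in exactly one fixed,
# non-overlapping window, keyed by the window's start time.
def tumbling_count(events, window_seconds):
    windows = defaultdict(int)
    for timestamp, _payload in events:
        window_start = (timestamp // window_seconds) * window_seconds
        windows[window_start] += 1
    return dict(windows)

events = [(1, "a"), (4, "b"), (12, "c"), (19, "d"), (23, "e")]
print(tumbling_count(events, 10))
# {0: 2, 10: 2, 20: 1}
```

Stream Analytics evaluates the same grouping continuously over an unbounded stream and emits one result per window as it closes, rather than over a finished list.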
28. Event Hubs
● A hyper-scale telemetry ingestion service that collects, transforms and stores millions of events.
● Event Hubs is a fully managed, real-time data ingestion service that's simple, trusted and scalable.
29. Event Hubs
● Integrate seamlessly with other Azure services to unlock valuable insights.
● Experience real-time data ingestion and microbatching on the same stream.
30. Event Hubs
● Focus on drawing insights from your data instead of managing infrastructure. Build real-time big data pipelines and respond to business challenges right away.
● Build real-time data pipelines with just a couple of clicks. Seamlessly integrate with Azure data services to uncover insights faster.
31. Event Hubs
● Ingest millions of events per second - Continuously ingress data from hundreds of thousands of sources with low latency and configurable time retention.
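The client-side batching that makes that throughput practical (the "microbatching" above) can be sketched as a toy batcher. This is a stand-in for the pattern, not the Event Hubs SDK; the class and sizes are invented:

```python
# Toy micro-batcher: accumulate events and flush in fixed-size batches,
# amortizing the per-send overhead across many events.
class Batcher:
    def __init__(self, batch_size, send):
        self.batch_size, self.send, self.buffer = batch_size, send, []

    def add(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(list(self.buffer))   # one "network call" per batch
            self.buffer.clear()

sent = []
b = Batcher(batch_size=3, send=sent.append)
for i in range(7):
    b.add(i)
b.flush()                                  # drain the partial final batch
print(sent)  # [[0, 1, 2], [3, 4, 5], [6]]
```

The real service adds partitioning, retention and checkpointing on top of this basic batching idea.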
33. Analysis Services
Focus on solving business problems, not learning new skills, when you use the familiar, integrated development environment of Visual Studio. Easily deploy your existing SQL Server 2016 tabular models to the cloud.
35. Data Lake Analytics
● Easily develop and run massively parallel data transformation and processing programs in U-SQL, R, Python and .NET over petabytes of data. With no infrastructure to manage, you can process data on demand, scale instantly and only pay per job.
36. Data Lake Analytics
● Process big data jobs in seconds with Azure Data Lake Analytics. There is no infrastructure to worry about because there are no servers, virtual machines or clusters to wait for, manage or tune.
37. Data Lake Analytics
● Instantly scale the processing power, measured in Azure Data Lake Analytics Units (AU), from one to thousands for each job. You only pay for the processing that you use per job.
38. Data Lake Analytics
● U-SQL is a simple, expressive and extensible language that allows you to write code once and have it automatically parallelised for the scale you need.
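The shape of a typical U-SQL job (EXTRACT from files, a SQL-like transform, OUTPUT of results) can be sketched in Python. This is a toy in-memory version with invented sales rows, not U-SQL itself:

```python
from collections import defaultdict

# Toy version of the classic U-SQL job shape:
#   @rows   = EXTRACT ... FROM "/data/sales.csv" USING Extractors.Csv();
#   @totals = SELECT Region, SUM(Amount) FROM @rows GROUP BY Region;
#   OUTPUT @totals TO ... USING Outputters.Csv();
def extract(csv_lines):
    for line in csv_lines:
        region, amount = line.split(",")
        yield region, float(amount)

def total_by_region(rows):
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)

sales = ["EMEA,100.0", "APAC,40.0", "EMEA,60.0"]
print(total_by_region(extract(sales)))
# {'EMEA': 160.0, 'APAC': 40.0}
```

In Data Lake Analytics the extract and aggregation stages are what the engine parallelises across the AUs you assign to the job.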
39. Data Lake Analytics
● Process petabytes of data for diverse workload categories such as querying, ETL, analytics, machine learning, machine translation, image processing and sentiment analysis by leveraging existing libraries written in .NET languages, R or Python.
45. What is big data?
• "When you have to innovate to collect, store, organize, analyse and share it."
- Werner Vogels, Amazon CTO
46. What is Big Data?
• Traditionally…..
– Physics Experiments
– Sensor data
– Satellite data
47. Now?
• Now: zettabytes in the cloud expected by the end of next year
• https://datarelish.net/2019/11/07/whats-the-future-for-cloud-data-storage-clouds-of-glass/
48. Azure Synapse
● Azure Synapse delivers insights from all your data, across data warehouses and big data analytics systems, with blazing speed.
● Data professionals can query both relational and non-relational data at petabyte scale using the familiar SQL language.
● Credit to the Microsoft team for help with these decks.
49. Azure Synapse
● Azure Synapse is a limitless analytics service that brings together enterprise data warehousing and Big Data analytics.
● Query data on your terms, using either serverless on-demand or provisioned resources, at scale.
51. Azure Synapse
• Support for SSDT with Visual Studio 2019
• Native platform integration with Azure DevOps
• Built-in continuous integration and deployment (CI/CD) capabilities for enterprise-level deployments
53. Azure Synapse
● Best-in-class price per performance: up to 94% less expensive than competitors.
● Developer productivity: use preferred tooling for SQL data warehouse development.
● Intelligent workload management: prioritize resources for the most valuable workloads.
● Data flexibility: ingest a variety of data sources to derive the maximum benefit.
● Industry-leading security: defense-in-depth security and a 99.9% financially backed availability SLA.
54. Power BI: Import, DirectQuery, Composite Models & Aggregation Tables
● Import: great for small data sources and personal data discovery; fine for CSV files, spreadsheet data and summarized OLTP data.
● DirectQuery: the enterprise solution; avoid data movement and delegate query work to the back-end source, taking advantage of Azure SQL Data Warehouse's advanced features.
● Composite Models & Aggregation Tables: why choose? Import and DirectQuery in a single model; keep summarized data local and get detail data from the source.
55. Azure SQL Data Warehouse
● Best-in-class price per performance: up to 94% less expensive than competitors.
● Developer productivity: use preferred tooling for SQL data warehouse development.
● Intelligent workload management: prioritize resources for the most valuable workloads.
● Data flexibility: ingest a variety of data sources to derive the maximum benefit.
● Industry-leading security: defense-in-depth security and a 99.9% financially backed availability SLA.
56. Complete Data Security
● Data Protection: data in transit; data encryption at rest; data discovery and classification
● Access Control: object-level security (tables/views); row-level security; column-level security; dynamic data masking
● Authentication: SQL login; Azure Active Directory; multi-factor authentication
● Network Security: virtual networks; firewall; Azure ExpressRoute
● Threat Protection: threat detection; auditing; vulnerability assessment
57. Workload Management
● Intra-cluster workload isolation (scale in)
● Predictable cost
● Online elasticity
● Efficient for unpredictable workloads
● No cache eviction for scaling

CREATE WORKLOAD GROUP Sales
WITH
(
    MIN_PERCENTAGE_RESOURCE = 60,
    CAP_PERCENTAGE_RESOURCE = 100,
    MAX_CONCURRENCY = 6
);
67. Making business data accessible
PolyBase provides a scalable, T-SQL-compatible query processing framework for combining data from both universes: relational stores and big data.
68. PolyBase Purpose
                           Consumer       Analyst         Scientist
Data Volume                Medium to Low  Reasonable      High to Huge
Degree of Structure        Very High      Some            Low to None
Number of Users            Very High      Medium          Low
Transformation Complexity  Low            Medium to High  High
Analytics Complexity       Low            Medium          Very High
71. Agenda
• Why machine learning in SQL Server?
• How to leverage:
– SQL Compute context
– sp_execute_external_script features
– PREDICT T-SQL Function
• Call to action
• Questions
72. Why machine learning with SQL Server?
● Reduce or eliminate data movement with in-database analytics
● Operationalize machine learning models
● Get enterprise scale, performance, and security
73. Machine Learning Services
• R/Python integration design
  – Invokes the runtime outside of the SQL Server process
  – Batch-oriented operations
• SQL compute context
75. Typical machine learning workflow against a database
From any R/Python IDE on the data scientist's workstation:
1. Pull data from SQL Server:
   train <- sqlQuery(connection, "select * from nyctaxi_sample")
2. Execute locally:
   model <- glm(formula, train)
3. Model output
76. Machine learning workflow using SQL compute context
From any R/Python IDE on the data scientist's workstation:
1. Script sets the SQL Server compute context:
   cc <- RxInSqlServer(connectionString, computeContext)
   rxLogit(formula, cc)
2. Execution inside SQL Server 2017 Machine Learning Services (R/Python runtime)
3. rx* output
4. Model or predictions returned
79. Push data from SQL Server to the external runtime
sp_execute_external_script
    @input_data_1 = N'SELECT * FROM TrainingData'
The input dataset arrives as an R data.frame or a pandas DataFrame.
80. Read files with R Server
• R can read almost all flat text files, such as CSV and TXT, as well as formats like SPSS.
• Provide the file path, and R reads the file from that location.
81. RStudio & XDF files
• RStudio can easily convert our text files, whether CSV or other text formats, into the XDF format.
• XDF files can only be read by R, and they are very small compared to other file formats.
83. Convert a file to XDF
• TXT or CSV files can be converted to the XDF format.
• XDF files can only be read by R, and they are very small compared to other file formats.
84. rxCrossTabs
rxCrossTabs() is used to create contingency tables from cross-classifying factors using a formula interface.
rxCrossTabs() is also used to compute sums according to combinations of different variables.
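The idea behind a contingency table (counting how often each combination of two cross-classifying factors occurs) can be sketched in plain Python with invented rows; the real rxCrossTabs() does this over XDF files at scale via a formula interface:

```python
from collections import Counter

# Toy contingency table: count occurrences of each combination
# of two factors, the core of what rxCrossTabs computes.
def cross_tabs(rows, factor_a, factor_b):
    return Counter((row[factor_a], row[factor_b]) for row in rows)

rows = [
    {"gender": "F", "product": "bike"},
    {"gender": "F", "product": "bike"},
    {"gender": "M", "product": "car"},
]
print(cross_tabs(rows, "gender", "product"))
# Counter({('F', 'bike'): 2, ('M', 'car'): 1})
```

rxCrossTabs() additionally lays the counts (or sums) out as a cross-tabulated table rather than a flat counter.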
85. rxCube
• rxCube() performs a very similar function to rxCrossTabs().
• It computes tabulated sums or means.
• rxCube() produces the sums or means in long format rather than a table.
• This can be useful when we want to aggregate data for further analysis within R.
86. The dplyrXdf package
The dplyr package is a popular toolkit for data transformation and manipulation.
dplyr supports data frames and data tables (from the data.table package) as backends.
The dplyrXdf package implements such a backend for the XDF file format, a technology supplied as part of Revolution R Enterprise.
87. ggplot2
ggplot2 allows you to create graphs that represent both univariate and multivariate numerical and categorical data in a straightforward manner.
It can be used to create the most common graph types, and a very wide range of useful plots beyond them.
89. Custom visualizations with rxSummary & rxCube
• rxCube() is similar to rxSummary(), but it returns fewer statistical summaries and therefore runs faster.
• With y ~ u : v as the formula, rxCube() returns counts and averages for column y.
91. rxHistogram
• rxHistogram() is used to create a histogram, for example for the Close variable.
• Syntax: rxHistogram(formula, data, …)
• formula = a formula that contains the variable you want to visualize.
92. rxLinePlot
• A line or scatter plot uses data from an .xdf file or data frame.
• Syntax: rxLinePlot(formula, data, …)
• formula = for this function, the formula should have one variable on the left side of the ~ that reflects the Y-axis, and one on the right side for the X-axis.
93. rxDataStep
• The rxDataStep function can be used to process data in chunks.
• rxDataStep can be used to create and transform subsets of data.
95. Subset rows of data using the transform argument
• A common use of rxDataStep is to create a new data set with a subset of rows and variables.
• For this purpose, we use the data frame of our data as the input data set.
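The chunked subset-and-transform pattern behind rxDataStep can be sketched in Python. The in-memory chunks and column names here are invented stand-ins for an XDF file:

```python
# Toy rxDataStep: stream over the data one chunk at a time, keep a
# subset of rows and variables, and apply an on-the-fly transform,
# without loading the whole data set at once.
def data_step(chunks, keep_vars, row_filter, transform):
    out = []
    for chunk in chunks:                       # one chunk at a time
        for row in chunk:
            if row_filter(row):
                new_row = {v: row[v] for v in keep_vars}
                out.append(transform(new_row))
    return out

chunks = [
    [{"close": 10.0, "volume": 100}, {"close": 12.0, "volume": 5}],
    [{"close": 9.0, "volume": 50}],
]
result = data_step(
    chunks,
    keep_vars=["close"],
    row_filter=lambda r: r["volume"] >= 50,                 # subset of rows
    transform=lambda r: {**r, "close_x2": r["close"] * 2},  # new variable
)
print(result)
# [{'close': 10.0, 'close_x2': 20.0}, {'close': 9.0, 'close_x2': 18.0}]
```

Processing chunk by chunk is what lets rxDataStep handle data sets larger than memory.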
96. On-the-fly transformation
• Analytical functions within the RevoScaleR package use a formal transformation-function framework for generating on-the-fly variables.
• The RevoScaleR approach is to use the transforms argument.
97. In-data transformation
• There are two main approaches to in-data transformation:
• Define an external R function and reference it.
• Define an embedded transformation as an input to the transforms argument of another function.
98. Generate a data frame
• A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column.
99. Generate a data frame
• Code:
  SalesData <- file.path("D:/773Demo", "CustomerSalesInfo.xdf")
  SalesDataFrame <- rxImport(inData = SalesData)
100. POSIXct & POSIXlt
• R provides several options for dealing with date and date/time data. The POSIXct and POSIXlt classes allow for dates and times with control for time zones.
101. Transform functions
▪ transform() is a generic function which does useful things with data frames.
▪ Embedded transformations provide instructions within a formula, through arguments on a function.
▪ Using just arguments, you can manipulate data using transformations.
102. Summary
Improve the performance of your ML scripts by using:
– SQL compute context from the client (rx* functions)
– Streaming to reduce memory usage
– Trivial parallelism for scoring (predict or rxPredict)
– Parallel training and scoring using rx* functions
– The native PREDICT function for low-latency scoring
103. Call to action
• Resources
– SQL Server samples on GitHub: R Services & ML Services
– Getting started tutorials: AKA.MS/MLSQLDEV
– Configure instance: SSMS Reports for ML Services
– ML cheat sheet
– Microsoft documentation: SQL Server Machine