Service-generated big data and big data-as-a-service, by Jyotir Moy
This document provides an overview of service-generated big data and big data-as-a-service. It discusses three types of service-generated big data: service trace logs, service QoS information, and service relationship data. It also describes big data-as-a-service which includes big data infrastructure-as-a-service, platform-as-a-service, and analytics software-as-a-service to provide common big data services and analyze the large volumes of service data. The business opportunities of big data-as-a-service are also briefly discussed.
Introduction to Cloud Computing and Big Data-Hadoop, by Nagarjuna D.N
Cloud Computing Evolution
Why is Cloud Computing needed?
Cloud Computing Models
Cloud Solutions
Cloud Jobs opportunities
Criteria for Big Data
Big Data challenges
Technologies to process Big Data- Hadoop
Hadoop History and Architecture
Hadoop Eco-System
Hadoop Real-time Use cases
Hadoop Job opportunities
Hadoop and SAP HANA integration
Summary
Core Concepts and Key Technologies - Big Data Analytics, by Kaniska Mandal
Big data analytics has evolved beyond batch processing with Hadoop to extract intelligence from data streams in real time. New technologies preserve data locality, allow real-time processing and streaming, support complex analytics functions, provide rich data models and queries, optimize data flow and queries, and leverage CPU caches and distributed memory for speed. Frameworks like Spark and Shark improve on MapReduce with in-memory computation and dynamic resource allocation.
Big data refers to datasets that are too large to be managed by traditional database tools. It is characterized by volume, velocity, and variety. Hadoop is an open-source software framework that allows distributed processing of large datasets across clusters of computers. It works by distributing storage across nodes as blocks and distributing computation via a MapReduce programming paradigm where nodes process data in parallel. Common uses of big data include analyzing social media, sensor data, and using machine learning on large datasets.
The document outlines an agenda for a presentation on big data. It discusses key topics like the state of big data adoption, a holistic approach to big data, five high value use cases, technical components, and the future of big data and cloud. The presentation aims to provide an overview of big data and how organizations can take a comprehensive approach to leveraging their data assets.
Great Expectations is an open-source Python library that helps validate, document, and profile data to maintain quality. It allows users to define expectations about data that are used to validate new data and generate documentation. Key features include automated data profiling, predefined and custom validation rules, and scalability. It is used by companies like Vimeo and Heineken in their data pipelines. While helpful for testing data, it is not intended as a data cleaning or versioning tool. A demo shows how to initialize a project, validate sample taxi data, and view results.
Big Data & Analytics continues to redefine business. Data has transitioned from an underused asset to the lifeblood of the organisation, and a critical component of business intelligence, insight and strategy.
Big Data Scotland is the largest annual data analytics conference held in Scotland: it is supported by ScotlandIS and The Data Lab and free for delegates to attend. The conference is geared towards senior technologists and business leaders and aims to provide a unique forum for knowledge exchange, discussion and cross-pollination.
The programme will explore the evolution of data analytics; looking at key tools and techniques and how these can be applied to deliver practical insight and value. Presentations will span a wide array of topics from Data Wrangling and Visualisation to AI, Chatbots and Industry 4.0.
Key Topics
• Tools and techniques
• Corporate data culture, business processes, digital transformation
• Business intelligence, trends, decision making
• AI, Real-time Analytics, IoT, Industry 4.0, Robotics
• Security, regulation, privacy, consent, anonymization
• Data visualisation, interpretation and communication
• CRM and Personalisation
The 3 V's of Big Data: Variety, Velocity, and Volume (from Structure:Data 2012), by Gigaom
The document discusses the 3 V's of big data: volume, velocity, and variety. It provides examples of how each V impacts data analysis and storage. It also discusses how text data has been a major driver of big data growth and challenges. The key challenges are processing large and diverse datasets quickly enough to keep up with real-time data streams and demands.
This report examines the rise of big data and analytics used to analyze large volumes of data. It is based on a survey of 302 BI professionals and interviews. Most organizations have implemented analytical platforms to help analyze growing amounts of structured data. New technologies also analyze semi-structured data like web logs and machine data. While reports and dashboards serve casual users, more advanced analytics are needed for power users to fully leverage big data.
Guest Lecture: Introduction to Big Data at Indian Institute of Technology, by Nishant Gandhi
This document provides an introduction to big data, including definitions of big data and why it is important. It discusses characteristics of big data like volume, velocity, variety and veracity. It provides examples of big data applications in various industries like GE, Boeing, social media, finance, CERN, journalism, politics and more. It also introduces NoSQL and the CAP theorem, and concludes that big data is changing business and technology by enabling new insights from data to reduce costs and optimize operations.
This document discusses big data, including the large amounts of data being collected daily, challenges with traditional DBMS solutions, the need for new approaches like Hadoop and Aster Data to handle large volumes of structured and unstructured data, techniques for analyzing big data, and case studies of companies like Mobclix and Yahoo using big data solutions.
Implementation of Big Data infrastructure and technology can be seen in various industries such as banking, retail, insurance, healthcare and media. Big Data management functions like storage, sorting, processing and analysis of such colossal volumes cannot be handled by existing database systems or technologies. Frameworks come into the picture in such scenarios. Frameworks are toolsets that offer innovative, cost-effective solutions to the problems posed by Big Data processing; they help provide insights, incorporate metadata and aid decision making aligned to business needs.
Disclaimer :
The images, company, product and service names that are used in this presentation, are for illustration purposes only. All trademarks and registered trademarks are the property of their respective owners.
Data and images were collected from various sources on the Internet.
The intention was to present the big picture of Big Data & Hadoop.
This document outlines the course content for a Big Data Analytics course. The course covers key concepts related to big data including Hadoop, MapReduce, HDFS, YARN, Pig, Hive, NoSQL databases and analytics tools. The 5 units cover introductions to big data and Hadoop, MapReduce and YARN, analyzing data with Pig and Hive, and NoSQL data management. Experiments related to big data are also listed.
At the Technology Trends seminar, with HCMC University of Polytechnics' lecturers, KMS Technology's CTO delivered a topic of Big Data, Cloud Computing, Mobile, Social Media and In-memory Computing.
This document provides an overview of big data concepts and technologies. It discusses the growth of data, characteristics of big data including volume, variety and velocity. Popular big data technologies like Hadoop, MapReduce, HDFS, Pig and Hive are explained. NoSQL databases like Cassandra, HBase and MongoDB are introduced. The document also covers massively parallel processing databases and column-oriented databases like Vertica. Overall, the document aims to give the reader a high-level understanding of the big data landscape and popular associated technologies.
This document provides an overview of big data, including its definition, characteristics, sources, tools, applications, risks, benefits and future. Big data is characterized by large volumes of data in various formats that are difficult to process using traditional data management and analysis systems. It is generated from sources like user interactions, sensors and systems logs. Tools like Hadoop and NoSQL databases enable storing, processing and analyzing big data. Organizations apply big data analytics to areas such as healthcare, retail and security. While big data poses privacy and management challenges, it also provides opportunities to gain insights and make improved decisions. The big data industry is growing rapidly and expected to be worth over $100 billion.
This document provides an overview of big data including:
- It defines big data and discusses its key characteristics of volume, velocity, and variety.
- It describes sources of big data like social media, sensors, and user clickstreams. Tools for big data include Hadoop, MongoDB, and cloud computing.
- Applications of big data analytics include smarter healthcare, traffic control, and personalized marketing. Risks include privacy and high costs. Benefits include better decisions, opportunities for new businesses, and improved customer experiences.
- The future of big data is strong, with worldwide revenues projected to grow from $5 billion in 2012 to over $50 billion in 2017, creating millions of new jobs for data scientists and analysts.
Content1. Introduction2. What is Big Data3. Characte.docx, by dickonsondorris
Content
1. Introduction
2. What is Big Data
3. Characteristic of Big Data
4. Storing, selecting and processing of Big Data
5. Why Big Data
6. How it is Different
7. Big Data sources
8. Tools used in Big Data
9. Application of Big Data
10. Risks of Big Data
11. Benefits of Big Data
12. How Big Data Impact on IT
13. Future of Big Data
Introduction
• Big Data may well be the Next Big Thing in the IT world.
• Big data burst upon the scene in the first decade of the 21st century.
• The first organizations to embrace it were online and startup firms. Firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning.
• Like many new information technologies, big data can bring about dramatic cost reductions, substantial improvements in the time required to perform a computing task, or new product and service offerings.
• 'Big Data' is similar to 'small data', but bigger in size.
• Because the data is bigger, it requires different approaches: techniques, tools and architecture.
• The aim is to solve new problems, or old problems in a better way.
• Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques.
What is BIG DATA?
• Walmart handles more than 1 million customer transactions every hour.
• Facebook handles 40 billion photos from its user base.
• Decoding the human genome originally took 10 years to process; now it can be achieved in one week.
Three Characteristics of Big Data (the 3 V's)
• Volume: data quantity
• Velocity: data speed
• Variety: data types
1st Characteristic of Big Data: Volume
• A typical PC might have had 10 gigabytes of storage in 2000.
• Today, Facebook ingests 500 terabytes of new data every day.
• A Boeing 737 will generate 240 terabytes of flight data during a single flight across the US.
• Smartphones, the data they create and consume, and sensors embedded into everyday objects will soon result in billions of new, constantly updated data feeds containing environmental, location, and other information, including video.
2nd Characteristic of Big Data: Velocity
• Clickstreams and ad impressions capture user behavior at millions of events per second.
• High-frequency stock trading algorithms reflect market changes within microseconds.
• Machine-to-machine processes exchange data between billions of devices.
• Infrastructure and sensors generate massive log data in real time.
• Online gaming systems support millions of concurrent users, each producing multiple inputs per second.
3rd Characteristic of Big Data: Variety
• Big Data isn't just numbers, dates, and strings. Big Data is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media.
• Traditional database systems were designed to address smaller volumes of structured data, fewer updates, and a predictable, consistent data structure.
This document provides an overview of big data. It defines big data as large volumes of diverse data that are growing rapidly and require new techniques to capture, store, distribute, manage, and analyze. The key characteristics of big data are volume, velocity, and variety. Common sources of big data include sensors, mobile devices, social media, and business transactions. Tools like Hadoop and MapReduce are used to store and process big data across distributed systems. Applications of big data include smarter healthcare, traffic control, and personalized marketing. The future of big data is promising with the market expected to grow substantially in the coming years.
This document provides an overview of big data including:
- It defines big data and describes its three key characteristics: volume, velocity, and variety.
- It explains how big data is stored, selected, and processed using techniques like Hadoop and NoSQL databases.
- It discusses some common sources of big data, tools used to analyze it, and applications of big data analytics across different industries.
This document provides an overview of big data and Hadoop. It defines big data as high-volume, high-velocity, and high-variety data that requires new techniques to capture value. Hadoop is introduced as an open-source framework for distributed storage and processing of large datasets across clusters of computers. Key components of Hadoop include HDFS for storage and MapReduce for parallel processing. Benefits of Hadoop are its ability to handle large amounts of structured and unstructured data quickly and cost-effectively at large scales.
Every day we create roughly 2.5 quintillion bytes of data; 90% of the world's collected data has been generated in only the last 2 years. This slide deck covers all about big data in a simple and easy way.
Big Data Analytics & Trends Presentation discusses what big data is, why it's important, definitions of big data, data types and landscape, characteristics of big data like volume, velocity and variety. It covers data generation points, big data analytics, example scenarios, challenges of big data like storage and processing speed, and Hadoop as a framework to solve these challenges. The presentation differentiates between big data and data science, discusses salary trends in Hadoop/big data, and future growth of the big data market.
This document provides an overview of big data, including its definition, characteristics, sources, tools, applications, risks, benefits and future. Big data is characterized by its volume, velocity and variety. It is generated from sources like users, applications, sensors and more. Tools like Hadoop and databases are used to store, process and analyze big data. Big data analytics can provide benefits across many industries and applications. However, it also poses risks around privacy, costs and skills that must be addressed. The future of big data is promising, with the market expected to grow significantly in the coming years.
This document provides an overview of big data presented by five individuals. It defines big data, discusses its three key characteristics of volume, velocity and variety. It explains how big data is stored, selected and processed using techniques like Hadoop and MapReduce. Examples of big data sources and tools are provided. Applications of big data across various industries are highlighted. Both the risks and benefits of big data are summarized. The future growth of big data and its impact on IT is also outlined.
This document discusses big data, including what it is, common data sources, its volume, velocity and variety characteristics, solutions like Hadoop and its HDFS and MapReduce components, and the impact and future of big data. It explains that big data refers to large and complex datasets that are difficult to process using traditional tools. Hadoop provides a framework to store and process big data across clusters of commodity hardware.
This document provides an overview of big data and Hadoop. It defines big data as large volumes of structured, semi-structured and unstructured data that is growing exponentially and is too large for traditional databases to handle. It discusses the 4 V's of big data - volume, velocity, variety and veracity. The document then describes Hadoop as an open-source framework for distributed storage and processing of big data across clusters of commodity hardware. It outlines the key components of Hadoop including HDFS, MapReduce, YARN and related modules. The document also discusses challenges of big data, use cases for Hadoop and provides a demo of configuring an HDInsight Hadoop cluster on Azure.
This document provides an overview of big data in a seminar presentation. It defines big data, discusses its key characteristics of volume, velocity and variety. It describes how big data is stored, selected and processed. Examples of big data sources and tools used are provided. The applications and risks of big data are summarized. Benefits to organizations from big data analytics are outlined, as well as its impact on IT and future growth prospects.
Bigdata.
Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation, search, sharing, transfer, visualization, querying, updating and information privacy. The term "big data" often refers simply to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set. "There is little doubt that the quantities of data now available are indeed large, but that’s not the most relevant characteristic of this new data ecosystem."[2] Analysis of data sets can find new correlations to "spot business trends, prevent diseases, combat crime and so on."[3] Scientists, business executives, practitioners of medicine, advertising and governments alike regularly meet difficulties with large data-sets in areas including Internet search, fintech, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics,[4] connectomics, complex physics simulations, biology and environmental research.[5]
Data sets grow rapidly - in part because they are increasingly gathered by cheap and numerous information-sensing Internet of things devices such as mobile devices, aerial (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers and wireless sensor networks.[6][7] The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s;[8] as of 2012, every day 2.5 exabytes (2.5×10^18 bytes) of data are generated.[9] One question for large enterprises is determining who should own big-data initiatives that affect the entire organization.[10]
Relational database management systems and desktop statistics- and visualization-packages often have difficulty handling big data. The work may require "massively parallel software running on tens, hundreds, or even thousands of servers".[11] What counts as "big data" varies depending on the capabilities of the users and their tools, and expanding capabilities make big data a moving target. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."
1) Big data is being generated from many sources like web data, e-commerce purchases, banking transactions, social networks, science experiments, and more. The volume of data is huge and growing exponentially.
2) Big data is characterized by its volume, velocity, variety, and value. It requires new technologies and techniques for capture, storage, analysis, and visualization.
3) Analyzing big data can provide valuable insights but also poses challenges related to cost, integration of diverse data types, and shortage of data science experts. New platforms and tools are being developed to make big data more accessible and useful.
This presentation provides an overview of big data. It introduces the group members presenting and defines big data as large amounts of data that cannot be analyzed using traditional methods due to the volumes of data and the speed at which it is generated and needs to be processed. It describes the three main characteristics of big data as volume, velocity and variety. It also discusses storing, selecting, processing and analyzing big data and provides examples of big data sources and tools used. Potential applications and risks/benefits of big data are also summarized.
This document outlines a seminar presentation on big data. It begins with an introduction that defines big data and notes how it emerged in the early 21st century mainly through online firms. It then covers the three key characteristics of big data - volume, velocity and variety. Other sections discuss storing, selecting and processing big data, as well as tools used and applications. Risks, benefits and the future impact and growth of big data are also summarized. The presentation provides an overview of the key concepts regarding big data.
This document provides an overview of big data, including its definition, characteristics, sources, tools, applications, risks and benefits. It defines big data as large volumes of diverse data that can be analyzed to reveal patterns and trends. The three key characteristics are volume, velocity and variety. Examples of big data sources include social media, sensors and user data. Tools used for big data include Hadoop, MongoDB and analytics programs. Big data has many applications and benefits but also risks regarding privacy and regulation. The future of big data is strong with the market expected to grow significantly in coming years.
Big Data in Action : Operations, Analytics and more
1. Big Data in Action: Operations, Analytics and more
2. Agenda
• Meet & Greet Introduction.
• Unfolding the term “Big Data”.
– Evolution of Data to Big Data : Static to Stream.
– 3 V’s of Big Data.
• Overview of Implementing Big Data
– Examples of implementation of Big Data
– Implementing Big Data with Hadoop infrastructure
– Implementing Big Data with NoSQL databases like Cassandra & MongoDB.
• Advantages of implementing Big Data solutions.
• Open Forum Discussion/ Networking.
3. Vibhu Bhutani
Technical Project Manager
Started as a Java developer, I have many years of experience in developing and managing
state-of-the-art applications. With extensive experience across the phases of the SDLC model, I
lead the innovations & mobile excellence team at Softweb Solutions. I am involved in
various innovative implementations, including Big Data systems,
IoT implementations and iBeacon development at Softweb Solutions.
in/vibhuis
Welcome
4. Unfolding the Term Big Data
• IBM reported in a study that every day we create roughly 2.5 quintillion bytes of data from various sources such as climate sensors, GPS signals, social media and online transactions, of which 90% was created in the last couple of years. Big Data is a buzzword for technology that shows the potential to process huge amounts of data so that we can get valuable information out of it.
• How old is Big Data?
– It's as old as data itself; however, the parameters change every year. In 2012 it was about a couple of petabytes, and now it is about a few exabytes.
• Why do we now hear about Big Data?
– Although big data is old, nowadays more industries are becoming aware of its implications. In 2004 Google published a paper explaining the MapReduce technique for analyzing large datasets. After that, many other companies joined in and the buzzword Big Data came into existence.
• Static data VS Dynamic Data
5. Evolution of Data
Strange fact: with 76 KB of hardwired memory, NASA successfully took men to the Moon and brought them back. With an 8 GB iPhone, it could be done 108 times.
10. Application of Big Data - Cern
• In the 1960s CERN stored data on a mainframe computer.
• In the 1970s CERN distributed data across several machines, dividing the mainframe into smaller pieces of equipment; a CERN network was introduced to bridge these machines and reduce travel.
• In the 1980s these machines were placed in different countries across the US and Europe, and the Internet was introduced to connect them.
• Due to the enormous increase in data, in 2000 a CERN grid was introduced, connecting different smaller computers together to analyze and process the data.
• Detectors with 150 million sensors are used in the LHC, where protons collide at near light speed; they work like a 3D camera, taking pictures at a rate of 40 million times per second. The data is now stored in the cloud and analyzed using big data techniques.
11. Implementation of Big Data - Cern
Proton injection for collision. Collision of particles recording data in sensors.
12. Other Industries using Big Data
• Government Application:
– The US government has invested heavily in big data applications. Big data analysis played a large role in Barack Obama's successful 2012 re-election campaign.
– The Utah Data Center is a data center being constructed by the United States National Security Agency. The exact amount of storage space is unknown, but recent sources claim it will be on the order of a few exabytes.
– Big data analysis was partly responsible for the BJP and its allies winning the 2014 Indian general election.
– The UK government is using big data to improve weather forecasting and new drug release forecasts.
• Manufacturing Industries:
• Vast amounts of sensory data such as acoustics, vibration, pressure, current, voltage and controller data, in addition to historical data, constitute big data in manufacturing. This big data acts as the input to predictive tools and preventive strategies.
• Technology Industries:
• eBay and Amazon are industry leaders in maintaining large amounts of user searches and predictive analysis. This helps identify user needs and provide users with better results.
• Retail Industries:
• Walmart holds about 2.5 petabytes of data and handles 1 million customer transactions every hour.
• Amazon transacts about USD 80,000 per hour and has three of the world's largest databases.
13. Big Data Solutions - Hadoop
• Hadoop is an open-source system to reliably store and process lots of information.
• It is a Big Data solution that handles the complexity involved in the volume, variety and velocity of data.
• It turns commodity hardware into services that can handle petabytes of data in distributed environments: "Pigeon Computing".
• Hadoop is redundant, reliable, powerful, batch-process-centric and distributed.
16. Hadoop Implementation in Real World
• Yahoo:
– In 2008, Yahoo claimed the world's largest Hadoop production application. The Yahoo Search Webmap is a Hadoop application that runs on Linux with more than 10,000 cores.
• Facebook:
– In 2010, Facebook claimed that it had the largest Hadoop cluster in the world, with 21 PB of storage. On June 13, 2012 it announced the data had grown to 100 PB. On November 8, 2012 it announced the data gathered in the warehouse grows by roughly half a PB per day.
• As of 2013, Hadoop adoption is widespread. For example, more than half of the Fortune 50 use Hadoop.
• The New York Times used 100 Amazon EC2 instances and a Hadoop application to process 4 TB of raw image TIFF data (stored in S3) into 11 million finished PDFs in the space of 24 hours, at a computation cost of about $240 (not including bandwidth).
18. Introduction to NoSQL
• A NoSQL database provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases.
• Types of NoSQL Databases:
– Column: Cassandra, HBase
– Document: Apache CouchDB, MongoDB
– Key-value: Dynamo, Redis
– Graph: Neo4J
– Multi-model: OrientDB, Alchemy Database, CortexDB
19. High Level Architecture - Cassandra
• Ring based replication
• Only 1 type of server (Cassandra)
• All nodes hold data and can answer queries
• No Single Point of Failure
• Built for HA & scalability
• Multi-DC
• Data is found by key (CQL)
• Runs on JVM
21. High Level Architecture - Cassandra
Example: Single Row Partition
• Simple User system
• Identified by name (pk)
• 1 Row per partition
22. High Level Architecture - Cassandra
Example: Multiple Rows
• Comments on photos
• Comments are always selected by
the photo_id
• There are only 4 rows in 2 partitions
23. High Level Architecture - Cassandra
• Multiple rows are transposed into a single partition
• Partitions vary in size
• Old terminology - "wide row"
• Cassandra is built for fast writes. The data model should be denormalized so that as few reads as possible are needed (a CQL sketch follows below).
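To make the partitioning ideas above concrete, here is a minimal sketch in CQL executed through the DataStax Python driver. The keyspace name, replication settings and column names are illustrative assumptions rather than details taken from the slides.

```python
from cassandra.cluster import Cluster  # DataStax Python driver for Apache Cassandra

cluster = Cluster(["127.0.0.1"])       # assumes a local, single-node test cluster
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.set_keyspace("demo")

# Single-row partition: one user per partition, identified by name (the partition key).
session.execute("""
    CREATE TABLE IF NOT EXISTS users (
        name text PRIMARY KEY,
        email text
    )
""")

# Multiple rows per partition: all comments for a photo share one partition
# (photo_id is the partition key, comment_id the clustering column), so
# "comments for photo X" is a single-partition read -- the old "wide row".
session.execute("""
    CREATE TABLE IF NOT EXISTS photo_comments (
        photo_id uuid,
        comment_id timeuuid,
        author text,
        body text,
        PRIMARY KEY (photo_id, comment_id)
    )
""")

session.execute("INSERT INTO users (name, email) VALUES (%s, %s)", ("vibhu", "vibhu@example.com"))
print(list(session.execute("SELECT * FROM users WHERE name = %s", ("vibhu",))))
```

Denormalizing comments under photo_id trades extra writes for reads that always hit a single partition, which matches the write-optimized modelling advice above.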
24. High Level Architecture – Mongo DB
• Open-source, Document-oriented, popular for its
agile and scalable approach
• Notable Features :
– JSON/BSON data model with dynamic schema
– Auto-sharding for horizontal scalability
– Built-in replication with automated fail-overs
– Full, flexible index support including secondary
indexes
– Rich document-based queries
– Aggregation framework and Map / Reduce
– GridFS for large file storage
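A minimal sketch of the dynamic-schema document model, a secondary index and the aggregation framework described above, using PyMongo; the connection string, database and collection names are assumptions made for illustration.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # assumes a local mongod instance
db = client["demo"]

# Dynamic schema: documents in the same collection need not share the same fields.
db.photos.insert_many([
    {"photo_id": 1, "author": "alice", "tags": ["travel", "beach"], "likes": 12},
    {"photo_id": 2, "author": "bob", "location": {"lat": 46.6, "lon": 14.3}},
])

# Secondary index plus a rich document-based query.
db.photos.create_index("author")
print(db.photos.find_one({"author": "alice"}))

# Aggregation framework: count photos per author.
pipeline = [{"$group": {"_id": "$author", "photos": {"$sum": 1}}}]
print(list(db.photos.aggregate(pipeline)))
```

For large binary objects the same client exposes GridFS (gridfs.GridFS(db)), which chunks files across documents rather than storing them inline.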
25. High Level Architecture – Mongo DB
• Ensures High Availability, Redundancy, Automated
Fail-over
• Writes to the Primary, Reads from all
• Asynchronous replication
• In conventional terms, more like Master/Slave
replication
• Members can be configured to be: Secondary-only / Non-voting / Hidden / Arbiters / Delayed
26. When to use : Mongo DB
• Unstructured data from multiple suppliers
• GridFS : Stores large binary objects
• Spring Data Services
• Embedding and linking documents
• Easy replication set up for AWS
4. Example of streaming data: consider an application that searches for certain text in the emails we send. Emails can be treated as a stream of data; algorithms identify text matching specific patterns and send an alert if something is found. Nowadays many government agencies are working on this kind of system; a small sketch follows below.
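To make this concrete, here is a minimal sketch of such a pattern-matching alert loop in Python. It is an illustration only: the watch patterns, the message source and the send_alert helper are hypothetical placeholders, not something shown on the slides.

```python
import re

# Hypothetical watch list: patterns to flag in the incoming message stream.
WATCH_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (r"wire transfer", r"password reset")]

def send_alert(message, pattern):
    """Placeholder alert sink; a real system might page an analyst or write to a queue."""
    print(f"ALERT: pattern {pattern.pattern!r} matched: {message[:60]}")

def scan_stream(messages):
    """Treat incoming emails as an unbounded stream and test each one as it arrives."""
    for message in messages:              # `messages` can be any iterator, e.g. a queue consumer
        for pattern in WATCH_PATTERNS:
            if pattern.search(message):
                send_alert(message, pattern)

# Tiny in-memory "stream" for demonstration:
scan_stream(iter(["Quarterly report attached", "Please confirm the wire transfer today"]))
```

The same loop shape applies whether the stream comes from a message queue, a log tailer or a streaming framework; only the iterator feeding scan_stream changes.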
5. The image shows how data storage evolved. Archaeological findings show that around 2000 BC the Phaistos Disc was used to store information. These were clay discs that embedded data and stored it for a long period of time. Later, people wrote things in pyramids, followed by stone tablets.
6. Necessity is the mother of invention. The human brain always wants to know more, and to know more we need to process more. The information era gave us the data, and to process this data we created big data.
7. The characteristics of Big Data consist of the 3 V's: Volume, Variety and Velocity. Volume represents the bulk and size of data. Every decade the definition of big data changes; previously it was hard to store KBs of data, but now we store huge amounts of data on a smartphone. The image shows the amount of data being stored in different parts of the world.
Next comes Variety, the categorization of big data. By categorizing data we make it easy for data analysts to group interdependent data and derive some advantage from it.
8. Velocity represents the speed at which this data is generated; the image shows how fast we generate it.
It is worth thinking about what happens with this enormous amount of data we are generating, and this leads to the 4th V.
9. Value: what is the value of analyzing the data? The image shows how various industries are utilizing and analyzing this data. Apart from the monetary benefits, many other fields such as machine learning, scientific experiments and medicine benefit from Big Data.
10. In 1962, Arthur Samuel wrote a computer program to play checkers. The program was defeated initially, but later Samuel wrote a subprogram to analyze the board and compute the plays for winning. When the subprogram was linked with the checkers program, the computer started to win. This was an early instance of artificial intelligence, where the data generated by the computer was recorded and used to plan its moves.
12. Some car manufacturers are gathering data from sensors in the driver's seat; they identify the pattern when a driver feels sleepy and inform the driver by vibrating the steering wheel. The same technology is being used to detect theft based on sitting patterns.
14. MapReduce is the processing part; it runs the computation and returns the results.
The second part is HDFS. It stores all the data, as files and directories, and is highly scalable and distributed.
15. This is the classic MapReduce program for word count.
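The word-count program itself appears only as an image on the slide. As a stand-in, here is a minimal sketch of the same idea written as Hadoop Streaming mapper and reducer scripts in Python (the slide most likely showed the Java MapReduce version); file names and paths are illustrative.

```python
#!/usr/bin/env python3
"""Classic word count as Hadoop Streaming scripts: run with argument "mapper" or "reducer".

Hadoop Streaming pipes each input split to the mapper's stdin and the sorted,
shuffled key/value lines to the reducer's stdin, one record per line.
"""
import sys
from itertools import groupby

def mapper(stream):
    # Emit "<word>\t1" for every word in the input split.
    for line in stream:
        for word in line.split():
            print(f"{word.lower()}\t1")

def reducer(stream):
    # Input arrives sorted by key, so consecutive lines for the same word can be summed.
    parsed = (line.rstrip("\n").split("\t", 1) for line in stream)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")

if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else "mapper"
    (mapper if mode == "mapper" else reducer)(sys.stdin)
```

Locally the job can be simulated with a shell pipeline such as `cat input.txt | python3 wordcount.py mapper | sort | python3 wordcount.py reducer`, which mirrors the map, shuffle/sort and reduce phases.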
17. In theoretical computer science, the CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
Consistency (all nodes see the same data at the same time)
Availability (a guarantee that every request receives a response about whether it succeeded or failed)
Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
18. A column in a distributed data store is a NoSQL object of the lowest level in a keyspace. It is a tuple consisting of three elements: a unique name, a value and a timestamp.
Document: a trivial example would be scanning paper documents, extracting the title, author, and date from them either by OCR or by having a human locate and enter them, and storing each document in a 4-column relational database, the columns being author, title, date, and a blob full of page images.
Key-value: an associative array (map, symbol table, or dictionary) is an abstract data type composed of a collection of key-value pairs, such that each possible key appears just once in the collection.
Graph: a graph database uses graph structures for semantic queries, with nodes, edges, and properties to represent and store data.
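As a tiny illustration of how these data models differ, here is a sketch using plain Python structures; it is purely illustrative and not tied to any particular database product.

```python
import time
from collections import namedtuple

# Column family model: the lowest-level object is a (name, value, timestamp) triple inside a row.
Column = namedtuple("Column", ["name", "value", "timestamp"])
row = {"user:42": [Column("email", "ada@example.com", time.time()),
                   Column("city", "London", time.time())]}

# Key-value model: an associative array in which each key appears exactly once.
kv_store = {"session:9f3a": {"user_id": 42, "expires": 1735689600}}

# Document model: the value is a self-describing document (here a nested dict).
document = {"title": "Big Data in Action", "author": "Softweb Solutions", "tags": ["hadoop", "nosql"]}

# Graph model: nodes plus labelled edges carrying properties.
nodes = {1: {"label": "User", "name": "Ada"}, 2: {"label": "Photo", "photo_id": 77}}
edges = [(1, "LIKES", 2, {"at": "2014-06-01"})]

print(row["user:42"][0].name, list(kv_store)[0], document["title"], edges[0][1])
```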
27. That is not to say there are no disadvantages:
Issues with finding the right talent.
Issues with finding the proper use case.
Impact on white-collar jobs due to the high demand for data scientists.
Analyzing and separating good data from Big Data.