Automate Your Data, Free Your Mind
Aaron Swerlein
Contents
• About Me
• CRUD
• ETL
• Testing
• Benefits of Automation
• Tools
• Examples
• Questions
About Me
• I have a 7-year-old daughter.
• I enjoy playing card games (Euchre, Rummy, Uno, MTG).
• Love watching movies -> /(H|B)ollywood/
• I am an identical twin.
Introduction | CRUD
Introduction | ETL Flow
Introduction | Testing
Arrange
• Source file/table
• Front end application
Action
• Stored procedure, SQL script
• Submitting form on front end
Assert
• Data in target
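A minimal sketch of that arrange/action/assert flow in Ruby with Sequel. The connection options, table names, and the load_target procedure are placeholders, not anything from the slides:

require 'sequel'

# Arrange: connect and seed the source with a known record
db = Sequel.connect(adapter: 'oracle', database: 'db',
                    username: 'user', password: 'pass')
db[:source_table].insert(id: 1, amount: 100)

# Action: run the ETL step, here a stored procedure via straight SQL
db.run('CALL load_target()')

# Assert: the record landed in the target as expected
raise 'record missing from target' unless db[:target_table].where(id: 1).count == 1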
Database Testing
Database Testing | CRUD
Create
• New groups or users
Read
• Connect to database
• Validate tables/views
Update
• Change existing groups or accesses
Delete
• Inability to connect
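A hedged sketch of those database-level checks, reusing the db connection from the sketch above (the user names and the customer table are invented; test_connection and table_exists? are real Sequel methods):

# Read: we can connect, and expected tables/views exist
raise 'cannot connect' unless db.test_connection
raise 'missing table' unless db.table_exists?(:customer)

# Delete: a revoked account should no longer be able to connect
begin
  Sequel.connect(adapter: 'oracle', database: 'db',
                 username: 'revoked_user', password: 'pass')
  raise 'revoked user was still able to connect'
rescue Sequel::DatabaseConnectionError
  # expected: access was removed
end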
Table Testing
Table Testing | CRUD
Create
• New column, constraint, relationship
Read
• Validate existence of table
Update
• Table changes
Delete
• Removing constraints
• Removing columns
• Removing relationships
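Table-level checks can lean on Sequel's schema reflection; a sketch with made-up table and column names (schema and foreign_key_list are real Sequel methods, though support varies by adapter):

columns = db.schema(:customer)   # [[:id, {...}], [:customer_name, {...}], ...]
names = columns.map(&:first)

# Read/Create: the table has the expected columns
raise 'missing column' unless names.include?(:customer_name)

# Update: column definitions are correct (type, nullability, primary key)
id_meta = columns.to_h[:id]
raise 'id is not the primary key' unless id_meta[:primary_key]

# Relationships: foreign keys are present
raise 'missing relationship' if db.foreign_key_list(:customer).empty?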
Data Testing
Data Testing | CRUD
Create
• Inserted record is present
Read
• Can access data
Update
• Updated record is present, but the original record is no longer there
Delete
• Record no longer exists
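The same ideas at the data level, again as a sketch with invented table and column names:

presenters = db[:presenters]

# Create: the inserted record is present
presenters.insert(name: 'Aaron Swerlein', conference: 'QA or the Highway')
raise 'insert failed' if presenters.where(name: 'Aaron Swerlein').empty?

# Update: the new value is there and the original value is gone
presenters.where(name: 'Aaron Swerlein').update(conference: 'QA or the Highway 2017')
raise 'stale record' unless presenters.where(conference: 'QA or the Highway').empty?

# Delete: the record no longer exists
presenters.where(name: 'Aaron Swerlein').delete
raise 'record still present' unless presenters.where(name: 'Aaron Swerlein').empty?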
Benefits of Automation | Time
Benefits of Automation | Coverage
Benefits of Automation | Effort
• Image from Baahubali
Benefits of Automation | Deployment
Benefits of Automation | Jenkins
Benefits of Automation | Living Documentation
Benefits of Automation | Persistence
Scenario: validating speaker at QA or the Highway
When the user navigates to the qa or highway speakers webpage
Then the user validates the speaker Kyle Swerlein is present
Tools
Ruby/Cucumber
• Ruby
– Expressive language
– Good flexibility
• Cucumber
– Allows for plain-English scenarios to express requirements
– Helps bridge the developer-to-analyst knowledge gap
Cucumber Example
@regression
Scenario: validating test table loaded after jobs run
Given data is present in the source
When job 1 is executed
And job 2 is executed
And job 3 is executed
Then test table is loaded
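The step definitions behind a scenario like this might look roughly as follows. The $connection object, table names, and the run_job procedure are hypothetical, and the assertion assumes rspec-expectations is loaded; only the Cucumber Given/When/Then API is standard:

Given('data is present in the source') do
  $connection[:source_table].insert(id: 1, status: 'NEW')
end

When('job {int} is executed') do |job_number|
  # however jobs are kicked off in your shop; a stored procedure is one option
  $connection.run("CALL run_job_#{job_number}()")
end

Then('test table is loaded') do
  expect($connection[:test_table].count).to be > 0
end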
Ruby Gems
• Active Record:
– Default for Ruby on Rails
– Large, bulky library that includes all of its functionality by default
• Sequel:
– Lighter gem
– Uses adapters and plugins to add functionality to core gem
• Other:
– IBM_DB, OCI8, ADO
Ruby Gem - Sequel
• Sequel is a RubyGem which allows a user to connect to a database.
• It includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records (Active Record design pattern).
• Sequel has adapters for ADO (Access), IBM_DB, JDBC, MySQL, ODBC, Oracle, Hadoop, and many others.
ORM (Object Relational Mapping)
Sequel Connection Examples
• Sequel.connect()
Sequel.connect(adapter: 'adapter', database: 'database',
               username: 'username', password: 'password', port: 'port')
Sequel.connect("#{adapter}://#{username}:#{password}@#{hostname}:#{port}/#{database}")
• Sequel.ado()
Sequel.ado(conn_string: "Provider=Microsoft.ACE.OLEDB.12.0;
Data Source=C:/db/test_location.accdb")
• $connection – assign each connection to its own object so multiple connections can be open at once
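A small sketch of that multiple-connection pattern; the credentials and database names here are placeholders:

$oracle_connection = Sequel.connect(adapter: 'oracle', database: 'db',
                                    username: 'user', password: 'pass')
$access_connection = Sequel.ado(conn_string:
  'Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:/db/test_location.accdb')

# each global can now be queried independently
$oracle_connection[:some_table].count
$access_connection[:other_table].count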
Creating Table classes
class TableName < Sequel::Model($oracle_connection["database_name__table_name".to_sym])
  set_primary_key :primary_key
end

class TableName < Sequel::Model(
  $oracle_connection[Sequel.qualify("database_name", "table_name")])
  set_primary_key :primary_key
end

Sequel.split_symbols = true
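Once the model class is defined, the usual Sequel::Model calls work against it (the primary-key value and column name below are only illustrative):

TableName.count             # number of rows
TableName.first             # first row as a model object
record = TableName[123]     # look up by primary key
record.column_a if record   # read a column value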
Querying with Sequel
• Once connected to a database, you can query in a couple of different ways:
– Straight SQL statements
– Sequel's Ruby methods
Sequel Using SQL
• The Sequel library has different methods which can be called by passing in a SQL query. To do this, call the instantiated connection object you created, i.e. $connection:
– $connection.run('sql')
– $connection.fetch('sql')
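run is for statements where no result set is needed; fetch yields each returned row as a hash with symbol keys. The SQL below is just an example:

# run: execute and discard results (DDL, stored procedures, ...)
$connection.run('UPDATE presenters SET active = 1')

# fetch: iterate over the returned rows
$connection.fetch('SELECT * FROM presenters WHERE active = 1') do |row|
  puts row[:presenter_name]
end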
Sequel Using Ruby
where/exclude
order/reverse
select
join
insert
update
delete
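These methods chain, and a dataset's sql method shows the statement that will be sent to the database; a small sketch with invented table and column names:

ds = $connection[:presenters]
     .where(conference: 'QA or the Highway')
     .exclude(cancelled: true)
     .order(:talk_time)
     .select(:presenter_name, :talk_time)

puts ds.sql   # the SELECT statement Sequel generated
ds.all        # execute it and return the rows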
SQL VS. SEQUEL | Create
TableName.insert(presenter_name: 'Aaron Swerlein',
conference_name: 'QA or the Highway 2017',
talk_time: '2017-02-27 11:05:00.000000')
SQL VS. SEQUEL | Read
TableClass.select(:column_a, :column_b)
.where(column_a: 'text')
.where(column_b: [0,9])
.group(:column_a)
.order(:column_a)
SQL VS. SEQUEL | Update
TableName.where(presenter_name: 'Aaron Swerlein')
.where(conference_name: 'QA or the Highway')
.update(talk_time: '2018-02-27 11:05:00.000000')
SQL VS. SEQUEL | Delete
TableName.where(presenter_name: 'Aaron Swerlein')
.where(conference_name: 'QA or the Highway').delete
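For the SQL side of the comparison, the insert_sql/update_sql/delete_sql dataset methods return the statement Sequel would generate without executing it (the output shown in comments is approximate):

ds = TableName.where(presenter_name: 'Aaron Swerlein')

puts ds.update_sql(talk_time: '2018-02-27 11:05:00.000000')
# UPDATE table_name SET talk_time = '...' WHERE (presenter_name = 'Aaron Swerlein')

puts ds.delete_sql
# DELETE FROM table_name WHERE (presenter_name = 'Aaron Swerlein')

puts TableName.dataset.insert_sql(presenter_name: 'Aaron Swerlein')
# INSERT INTO table_name (presenter_name) VALUES ('Aaron Swerlein')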
Helpful Sequel library
• A very useful built-in Sequel library is the inflector library. It allows basic string manipulations which can make testing easier. Here are some examples and usages:

method               input          output
camelize/camelcase   'test_class'   'TestClass'
constantize          'test_class'   TestClass
underscore           'TestClass'    'test_class'
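The inflector is loaded as a Sequel extension, after which these become plain String methods (TestClass is only an example name):

require 'sequel'
Sequel.extension :inflector

'test_class'.camelize      # => "TestClass"
'TestClass'.underscore     # => "test_class"
'TestClass'.constantize    # => the TestClass constant, if one is defined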
Examples
Test Samples: Ruby
Automation Demo
Helpful links
http://sequel.jeremyevans.net/rdoc/
https://github.com/jeremyevans/sequel
https://github.com/jeremyevans/sequel/blob/master/doc/querying.rdoc
https://twin.github.io/ode-to-sequel/
Questions
Contact Information
• Email
– aswerlein511@gmail.com
• LinkedIn
– https://www.linkedin.com/in/aaron-swerlein-160b5269
• GitHub
– https://github.com/aswerlein511/SequelATDDDemo
Version

DATE         AUTHOR           VERSION   REASON
01/17/2017   Aaron Swerlein   1.0       Created document / draft

Editor's Notes

  1. CRUD operations (create, read, update, delete) are the basis of all ETL (Extract, Transform, Load) testing. They are not limited to database testing; Andy presented on services, which hit POST (create), read, update, and delete.
  2. A traditional ETL flow can be broken down into behavior-driven (BDD) actions: arranging data (creating source data, inserting a seed record, filling out a web page), running the ETL process (stored procedure, BTEQ, SQL script, submitting the web page), and validating that the results in the target are as expected (database, file).
  3. The first triangle speaks to the number of tests which will be written for the various aspects of database testing, similar to the 'testing triangle'. A good strategy is to dive in from the top and work down to data testing, which is the most time consuming.
  4. The database is traditionally the highest level to look at when testing data. Things which can be tested from this level: tables/views/procedures all exist as expected; the different access groups exist; and those groups' access to the specific database is as expected.
  5. Table testing lends itself to many more possible test cases. Those include: the table structure is correct; columns exist (names are correct); columns are defined correctly (defaults, size, data type, language, format); primary/foreign keys exist; relationships between tables are present; and constraints are built correctly.
  6. Data testing has a practically infinite number of tests which could be considered. Those include: column values, column relationships, foreign-key relationships with other tables, and date chaining (transaction management).
  7. Creation: data refreshes, PII (personally identifiable information) masking, actual migration of data between environments, and creating data to meet scenario coverage. Maintain: resetting updates to data back to the default state (rollback). Removal: leaving the database in the same state as before we started testing (ninja testing).
  8. There are an infinite number of data scenarios which can be tested, not all of which are valid. PROD data might not cover all scenarios, and being able to manipulate existing data or create our own for testing will ensure those scenarios are covered. Automating that data creation/destruction process will save time for anyone working with data setup.
  9. Any time we can reduce the burden of test execution on testers, we give them more time to focus on other 'household' items: refactoring, and doing more exploratory or vulnerability tests.
  10. As soon as code is promoted to a higher environment, the high-level checks can be run automatically to ensure everything promoted correctly, without the need for manual intervention.
  11. Test-tracking tools (QC, TFS, MTM) no longer need to be relied on so heavily to track test cases, because the automation suite does that for you. The automation suite is the documentation, and the results can be persisted in a database.
  12. Personally I have worked with Hadoop, Teradata, Oracle, DB2, MSSQL, and Microsoft Access. All of them except Access use the connect method; Access uses the ado method. Assign each statement to its own object to allow multiple connections.
  13. Prior to Sequel 5.0, the syntax for defining tables was DB__Tablename. The primary key is optional. The database reference ($connection) is needed to find the table. Split symbols allow legacy code to keep the '__'.
  14. Actually mocking an ETL process run is going to be the better option: we can prepare data, run a stored procedure or some sort of ETL flow, and validate the result without caring about the internals of the ETL process.