An introduction to Redis and a customized framework based on StackExchange.Redis, updated to use the singleton pattern and JSON
configuration mapping with a Redis instance Group and Name concept.
3. Who am I
Blackie Tsai
Senior IT consultant of Xuenn
Full stack developer
Focuses on developing real-time transaction systems with low latency and high concurrency
Learning CI/CD and working with Agile and Lean
Blog
http://www.dotblogs.com.tw/blackie1019
5. Redis
Redis, a.k.a. Remote Dictionary Server, is an open-source (BSD-licensed) in-memory data structure store and one of the
key-value databases in the NoSQL family. It is used as a database, cache, and message broker, and it can also run atomic
operations.
Most Popular NoSQL(http://techstacks.io/)
Linux is recommended for deployment.
Features
Pure
Simple
Single thread
In-memory but persistent on disk database
Remote dictionary server
3.0.4 is the latest stable version.
Open-Source Database Redis: Real-World Experience Revealed (開源資料庫Redis實戰經驗大公開)
7. https://clusterhq.com/assets/pdfs/state-of-container-usage-june-2015.pdf
Over 70% would like to run a database or other stateful service in their
container environments, with MySQL and Redis the two leading choices
Important features for data management in container solutions were:
Integration of data management capabilities into existing container workflow and tools
Seamless movement of data between dev, test and production environments.
8. Who is using Redis
Facebook’s Instagram: Making the Switch to Cassandra from Redis, a 75% ‘Insta’ Savings
http://techstacks.io/tech/redis, http://redis.io/topics/whos-using-redis
10. Redis - Data Persistence
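[Diagram: a master Redis instance and a slave Redis instance run on separate servers (Server A and Server B), linked by replication; persistence to disk happens on the slave.]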
IMPLEMENTING PERSISTENCE IN REDIS
• Master instance with no persistence
• Slave instance with AOF enabled
12. Redis - Pitfalls
Server-side session with Redis
Redis has many eviction policies, but most of them are based on 'sampling'.
Alternative Solution
Use a database as another back end
Use Redis 3.0
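The sampling point above can be illustrated with a toy model. This is a minimal sketch, not Redis's actual implementation: the class name, the logical clock, and the default sample size are assumptions made for illustration. It shows why a sampled policy can evict a key (for example a live session) that is not the globally least recently used one.

```python
import random


class SampledLRUCache:
    """Toy key-value store that evicts by approximate LRU over a random sample.

    Hypothetical model for illustration only; not Redis source code.
    """

    def __init__(self, max_keys, sample_size=5, rng=None):
        self.max_keys = max_keys
        self.sample_size = sample_size
        self.rng = rng or random.Random()
        self.data = {}
        self.last_access = {}
        self.clock = 0  # logical clock, incremented on every access

    def _touch(self, key):
        self.clock += 1
        self.last_access[key] = self.clock

    def get(self, key):
        if key not in self.data:
            return None
        self._touch(key)
        return self.data[key]

    def set(self, key, value):
        # Evict only when a new key would exceed the limit.
        if key not in self.data and len(self.data) >= self.max_keys:
            self._evict_one()
        self.data[key] = value
        self._touch(key)

    def _evict_one(self):
        # Look at a random sample of keys, not at every key.
        count = min(self.sample_size, len(self.data))
        candidates = self.rng.sample(list(self.data), count)
        # Evict the least recently used key *within the sample*.
        victim = min(candidates, key=self.last_access.__getitem__)
        del self.data[victim]
        del self.last_access[victim]
        return victim
```

When the sample covers every key, the behavior matches exact LRU; with a small sample, the evicted key is only the oldest among those sampled, which is the risk for server-side sessions.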
Maximize CPU usage
Redis is single-threaded. One instance usually uses only one CPU
Redis, another step on the road
16. Stack Exchange
The world’s largest programming
community is growing
Stack Exchange is a network of 130+ Q&A communities
including Stack Overflow
Ranked the 54th largest website by global traffic
18. Stack Exchange - Info
Stack Overflow still uses Microsoft products.
Stack Overflow still uses a scale-up strategy with HA.
SQL Servers loaded with 384 GB of RAM and 2TB of SSD.
Stats
4 million users, 8 million questions, 40 million answers, 560 million pageviews a month.
Peak is more like 2600-3000 requests/sec on most weekdays.
25 servers, Stack Overflow has a 40:60 read-write ratio.
2 TB of SQL data all stored on SSDs, Each web server has 2x 320GB SSDs in a RAID 1.
DB servers average 10% CPU utilization, 11 web servers, using IIS.
2 load balancers, 1 active, using HAProxy
4 active database nodes, using MS SQL
2 machines for distributed cache and messaging using Redis
2 read-only SQL Servers used mainly for the Stack Exchange API
3 machines doing search with ElasticSearch
19. Stack Exchange - Caching
Caching
Cache all the things.
5 levels of caches.
1st: Caching in the browser, CDN, and proxies.
2nd: Using HttpRuntime.Cache. An in-memory, per server cache.
3rd: Redis.
4th: SQL Server Cache.
5th: SSD.
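A layered lookup like the one listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not Stack Exchange's code: plain dictionaries stand in for the per-server cache and for Redis, and the `loader` callable stands in for the database. All names are hypothetical.

```python
class LayeredCache:
    """Read-through cache over several layers, fastest layer first.

    Hypothetical sketch: each layer is any dict-like object.
    """

    def __init__(self, layers, loader):
        self.layers = layers  # e.g. [local_cache, shared_cache]
        self.loader = loader  # called only when every layer misses

    def get(self, key):
        for depth, layer in enumerate(self.layers):
            if key in layer:
                value = layer[key]
                # Copy the hit into the faster layers above it.
                for faster in self.layers[:depth]:
                    faster[key] = value
                return value
        # Full miss: go to the slow source, then fill every layer.
        value = self.loader(key)
        for layer in self.layers:
            layer[key] = value
        return value


# Usage: a local dict, a shared dict, and a fake slow source.
local_cache, shared_cache = {}, {}
cache = LayeredCache([local_cache, shared_cache], lambda key: f"page:{key}")
page = cache.get("help")
```

The design choice is that a hit in a slower layer repopulates the faster ones, so a restarted web server warms its local cache from the shared layer instead of from the database.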
20. Stack Exchange - Lessons Learned
Why use Redis if you use MS products?
gabeech: It's not about OS evangelism. We run things on the platform they run best on. Period. C# runs best on a
windows machine, we use IIS. Redis runs best on a *nix machine we use *nix.
Overkill as a strategy
SSDs Rock
Know your read/write workload
Keeping things very efficient means new machines are not needed often
Don’t be afraid to specialize
Do only what needs to be done
Reinvention is OK
Go down to the bare metal
No bureaucracy.
Garbage collection driven programming
The cost of inefficient code can be higher than you think
21. StackExchange.Redis
Basic Usage - getting started and basic usage
Configuration - options available when connecting to redis
Pipelines and Multiplexers - what is a multiplexer?
Keys, Values and Channels - discusses the data-types used on the API
Transactions - how atomic transactions work in redis
Events - the events available for logging / information purposes
Pub/Sub Message Order - advice on sequential and concurrent processing
Scripting - running Lua scripts with convenient named parameter replacement
https://github.com/StackExchange/StackExchange.Redis
25. RedisDemo
Features
Connection Mapping with Configuration
Configuration with Redis Instance Group and Name concept supported
Singleton pattern avoids wasting resources
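The features above can be sketched in a language-neutral way. This is an illustrative model only, not the RedisDemo code: the JSON shape, the hosts, the class names, and the `FakeConnection` stand-in are all assumptions. It shows a singleton manager that maps a (group, name) pair from JSON configuration to one shared connection object.

```python
import json
import threading

# Hypothetical configuration; the real project's schema may differ.
CONFIG_JSON = """
{
  "instances": [
    {"group": "session", "name": "primary", "host": "10.0.0.1", "port": 6379},
    {"group": "cache",   "name": "primary", "host": "10.0.0.2", "port": 6379}
  ]
}
"""


class FakeConnection:
    """Stand-in for an expensive connection object."""

    def __init__(self, host, port):
        self.host = host
        self.port = port


class RedisConnectionManager:
    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self, config_text):
        entries = json.loads(config_text)["instances"]
        # Map (group, name) to its endpoint.
        self._endpoints = {
            (e["group"], e["name"]): (e["host"], e["port"]) for e in entries
        }
        self._connections = {}
        self._conn_lock = threading.Lock()

    @classmethod
    def instance(cls):
        # Create the manager once and reuse it everywhere.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls(CONFIG_JSON)
            return cls._instance

    def get_connection(self, group, name):
        key = (group, name)
        with self._conn_lock:
            # Connect lazily, at most once per (group, name).
            if key not in self._connections:
                host, port = self._endpoints[key]
                self._connections[key] = FakeConnection(host, port)
            return self._connections[key]
```

Callers would ask for `RedisConnectionManager.instance().get_connection("session", "primary")` and always receive the same object, so no extra connections are opened per request.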
Dependency
StackExchange.Redis
FX.Configuration
Newtonsoft.Json
Log4Net (optional)
Demo Version
https://github.com/blackie1019/RedisDemo
30. Redis Data Management
Cross-platform Redis Desktop Manager: a desktop management GUI for Mac OS X, Windows, Debian and Ubuntu.
http://redisdesktop.com
32. Redis Server with Docker
Enable Virtualization Technology in the BIOS and install Docker Toolbox
Create a Docker container for Redis
Run the service
Create your web application container
If anything goes wrong, you can remove the machine and set it up again:
docker-machine rm default
docker-machine --native-ssh create -d virtualbox default
Dockerizing a Redis service
33. Reference
Scaling Stack Overflow (QCon NYC 2015)
Redis, another step on the road
Types of NoSQL databases
StackOverflow Update: 560M Pageviews A Month, 25 Servers, And It's All About Performance
Redis Design and Implementation (Redis 设计与实现)
Image collection for "Redis Design and Implementation" (《Redis 设计与实现》图片集)
Editor's Notes
The KISS Principle (the principle of simplicity)
It expresses the philosophy of 'Less is more'. KISS is an acronym for the design principle
"Keep it simple, Stupid!"
"keep it short and simple"
"keep it simple and straightforward".
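Pure
Written in ANSI C.
Almost no dependencies on third-party libraries.
memcached uses libevent, and its codebase is large.
Redis implemented its own epoll event loop, modeled on libevent.
KISS principle
Each data structure is responsible only for what it should do.
Simple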
No map-reduce.
No indexes.
No vector clocks.
Single thread
No thread context switch.
No thread race condition.
No other complicated condition
In-memory database, but can be persisted to disk
Data is operated on in memory.
Data can be persisted to disk.
More than just a cache server
Queue
Survey conducted by DevOps.com and ClusterHQ.com
This report is based on the current and planned container usage patterns of 285 respondents. The survey was conducted over the latter half of May 2015.
Consider a setup as shown in the preceding image; that is:
Master instance with no persistence
Slave instance with AOF enabled
In this case, the master does not need to perform any background disk operations and is fully dedicated to serve client requests, except for a trivial slave connection. The slave server configured with AOF performs the disk operations. As mentioned before, this file can be used to restore the master in case of a disaster.
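The idea behind an append-only file can be sketched in a few lines. This is a simplified model, not Redis's AOF format: the tab-separated log lines and the class name are assumptions, and keys or values containing tabs or newlines are not handled. Every write is appended to a log, and replaying the log rebuilds the in-memory state.

```python
import os


class AppendOnlyStore:
    """Toy in-memory store that persists every write to an append-only log."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._replay()  # rebuild state from any existing log
        self.log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                parts = line.rstrip("\n").split("\t")
                if parts[0] == "SET":
                    self.data[parts[1]] = parts[2]
                elif parts[0] == "DEL":
                    self.data.pop(parts[1], None)

    def _append(self, *fields):
        self.log.write("\t".join(fields) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # force the write to disk

    def set(self, key, value):
        self._append("SET", key, value)
        self.data[key] = value

    def delete(self, key):
        self._append("DEL", key)
        self.data.pop(key, None)

    def close(self):
        self.log.close()
```

Calling `fsync` on every write is the most durable and slowest choice; syncing less often trades durability for throughput, which is the same trade-off the notes describe for Redis persistence settings.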
Persistence in Redis is a matter of configuration, balancing the trade-off between performance, disk I/O, and data durability. If you are looking for more information on persistence in Redis, you will find the article by Salvatore Sanfilippo at http://oldblog.antirez.com/post/redis-persistence-demystified.html interesting.
Teams:
SRE (System Reliability Engineering): 5 people
Core Dev (Q&A site) : ~6-7 people
Core Dev Mobile: 6 people
Careers team that does development solely for the SO Careers product: 7 people
1st: Caching in the browser, CDN, and proxies.
2nd: Using HttpRuntime.Cache. An in-memory, per server cache.
3rd: Redis.
4th: SQL Server Cache.
5th: SSD.
For example, every help page is cached. Code to access a page is very terse:
Static methods and static classes are used. Really bad from an OOP perspective, but really fast and really friendly towards terse code. All code is directly addressed.
Caching is handled by a library layer of Redis and Dapper, a micro ORM.
To get around garbage collection problems, only one copy of a class used in templates is created and kept in a cache. Everything is measured, including GC operation. From the statistics it is known that layers of indirection increase GC pressure to the point of noticeable slowness.
CDN hits vary. Since the query string hash is based on file content, a file is only re-fetched after a build. It's typically 30-50 million hits a day for 300 to 600 GB of bandwidth.
A CDN is not used for CPU or I/O load, but to help users find answers faster.
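A content-based query string hash, as mentioned in the CDN note above, can be sketched as follows. This is a minimal illustration, not Stack Exchange's build code: the function name, the `v` parameter, the SHA-256 digest, and the 12-character length are all assumptions.

```python
import hashlib


def versioned_url(path, content, length=12):
    """Return the asset path with a query string derived from its content.

    The URL changes only when the bytes change, so a CDN or browser
    can cache it until the next build alters the file.
    """
    digest = hashlib.sha256(content).hexdigest()[:length]
    return f"{path}?v={digest}"


# Usage: the same bytes always produce the same URL.
url_before = versioned_url("/js/app.js", b"console.log('v1');")
url_after = versioned_url("/js/app.js", b"console.log('v2');")
```

Because the hash depends only on the file bytes, rebuilding without changes leaves every URL, and therefore every cached copy, untouched.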
Why use Redis if you use MS products?
gabeech: It's not about OS evangelism. We run things on the platform they run best on. Period. C# runs best on a windows machine, we use IIS. Redis runs best on a *nix machine we use *nix.
Overkill as a strategy.
Nick Craver on why their network is over provisioned: Is 20 Gb massive overkill? You bet your ass it is, the active SQL servers average around 100-200 Mb out of that 20 Gb pipe. However, things like backups, rebuilds, etc. can completely saturate it due to how much memory and SSD storage is present, so it does serve a purpose.
SSDs Rock.
The database nodes all use SSD and the average write time is 0 milliseconds.
Know your read/write workload.
Keeping things very efficient means new machines are not needed often.
Only when a new project comes along that needs different hardware for some reason is new hardware added. Typically memory is added, but other than that efficient code and low utilization means it doesn't need replacing. So typically talking about adding a) SSDs for more space, or b) new hardware for new projects.
Don’t be afraid to specialize.
SO uses complicated queries based on tags, which is why a specialized Tag Engine was developed.
Do only what needs to be done.
Tests weren’t necessary because an active community did the acceptance testing for them. Add projects only when required. Add a line of code only when necessary. You Ain't Gonna Need It really works.
Reinvention is OK.
Typical advice is don’t reinvent the wheel, you’ll just make it worse, by making it square, for example. At SO they don't worry about making a "Square Wheel". If developers can write something more lightweight than an already developed alternative, then go for it.
Go down to the bare metal.
Go into the IL (assembly language of .Net). Some coding is in IL, not C#. Look at SQL query plans. Take memory dumps of the web servers to see what is actually going on. Discovered, for example, a split call generated 2GB of garbage.
No bureaucracy.
There are always some tools your team needs. For example, an editor, the most recent version of Visual Studio, etc. Just make it happen without a lot of process getting in the way.
Garbage collection driven programming.
SO goes to great lengths to reduce garbage collection costs, skipping practices like TDD, avoiding layers of abstraction, and using static methods. While extreme, the result is highly performing code. When you're doing hundreds of millions of objects in a short window, you can actually measure pauses in the app domain while GC runs. These have a pretty decent impact on request performance.
The cost of inefficient code can be higher than you think.
Efficient code stretches hardware further, reduces power usage, makes code easier for programmers to understand.
Renaming Commands
A slightly unusual feature of redis is that you can disable and/or rename individual commands. As per the previous example, this is done via the CommandMap.
Twemproxy is a tool that allows multiple redis instances to be used as though they were a single server, with inbuilt sharding and fault tolerance (much like redis cluster, but implemented separately).
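Sharding keys across several instances can be sketched with a consistent hash ring. This is a generic illustration of the technique, not Twemproxy's implementation: the class name, the SHA-256 hash, the virtual-node count, and the node labels are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Maps keys to nodes using consistent hashing with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        points = []
        for node in nodes:
            # Several points per node spread the load more evenly.
            for i in range(replicas):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # First point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[index]


# Usage: route each key to one of three hypothetical instances.
ring = HashRing(["redis-a:6379", "redis-b:6379", "redis-c:6379"])
target = ring.node_for("session:42")
```

With this scheme, removing one instance only remaps the keys that lived on it; keys on the remaining instances keep their placement.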