Plugin-based software design
with Ruby and RubyGems
Sadayuki Furuhashi

Founder & Software Architect
RubyKaigi 2015
A little about me…
Sadayuki Furuhashi
github: @frsyuki
Fluentd - Unified log collection infrastructure
Embulk - Plugin-based parallel ETL
Founder & Software Architect
MessagePack - It's like JSON, but fast and small.
What’s Plugin Architecture?
Benefits of Plugin Architecture
> Plugins bring many features
> Plugins keep core software simple
> Plugins are easy to test
> Plugins build an active developer community
> “…if it’s designed well”.


How to design plugin architecture?
How did I design plugin architecture?
Today’s topic
> Plugin Architecture Design Patterns
> Plugin Architecture of Fluentd
> Plugin Architecture of Embulk
> Pitfalls & Challenges
Plugin Architecture Design Patterns
a) Traditional Extensible Software Architecture
b) Plugin-based Software Architecture
Traditional Extensible Software Architecture
[Diagram: a host application with plugins attached to it.]
Register plugins to extension points.
To add more extensibility, add more extension points.
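As a rough illustration of the traditional approach, the sketch below shows a host application that exposes a fixed list of extension points. All names here (HostApplication, EXTENSION_POINTS, register, save) are hypothetical and not taken from the talk.

```ruby
# Hypothetical host application with a fixed set of extension points.
class HostApplication
  EXTENSION_POINTS = [:before_save, :after_save].freeze

  def initialize
    @hooks = Hash.new { |hash, key| hash[key] = [] }
  end

  # Plugins register a callable to one of the predefined extension points.
  def register(point, &hook)
    unless EXTENSION_POINTS.include?(point)
      raise ArgumentError, "unknown extension point: #{point}"
    end
    @hooks[point] << hook
  end

  # The host decides when each extension point fires.
  def save(record)
    @hooks[:before_save].each { |hook| record = hook.call(record) }
    saved = record.dup
    @hooks[:after_save].each { |hook| hook.call(saved) }
    saved
  end
end

app = HostApplication.new
app.register(:before_save) { |record| record.merge({ checked: true }) }
app.save({ name: "example" })
```

Anything a plugin wants to do outside :before_save and :after_save requires changing the host itself, which is the limitation the slide points at.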
Plugin-based software architecture
[Diagram: an application made of a small core surrounded by many plugins.]
Plugin-based software architecture
• Application as a network of plugins.
> Plugins: provide features.
> Core: framework to implement plugins.
• More flexibility != More complexity.
• The application must be designed to be modular.
> It’s hard to design :(
> Optimizing performance is difficult :(
• A loosely-coupled API often makes performance worse.
Design Pattern 1: Dependency Injection
[Diagram: Core made of classes and interfaces.]
Each component publishes an API. A component is an interface or a class.
Design Pattern 1: Dependency Injection
[Diagram: Core in which most classes have been replaced by plugins.]
When the application runs, a DI container replaces objects with plugins.
Design Pattern 1: Dependency Injection
Testing the application: replace classes with mocks for unit tests.
[Diagram: Core in which every component except one plugin is replaced by a dummy.]
Dependency Injection (Java)
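// Assumes Google Guice (com.google.inject), whose API matches the slide.
import com.google.inject.Binder;
import com.google.inject.Inject;

// The published API: callers depend only on this interface.
public interface Store
{
    void store(String data);
}

public class Module
{
    // The DI container supplies the Store implementation.
    @Inject
    Module(Store store)
    {
        store.store("data");
    }
}

public class DummyStore
        implements Store
{
    @Override
    public void store(String data) { }
}

// Fully qualified to avoid a clash with the Module class above.
public class MainModule
        implements com.google.inject.Module
{
    @Override
    public void configure(Binder binder)
    {
        binder.bind(Store.class)
            .to(DummyStore.class);
    }
}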
The binding is an interface → implementation mapping.
From the source code, the implementation is a black box.
It’s replaced at runtime.
Dependency Injection (Ruby)
Ruby?
(What’s a good way to use DI in Ruby?)
(Please tell me if you know)
Dependency Injection (Ruby)
I want to do this: keyword arguments as injection points, with a {:keyword => class} mapping applied at runtime.
(Injector is a wished-for API here, not an existing library.)

class Module
  def initialize(store: DummyStore.new)
    store.store("data")
  end
end

class DummyStore
  def store(data)
  end
end

class DBStore
  def initialize(db: DBM.new)
    @db = db
  end

  def store(data)
    @db.insert(data)
  end
end

# Bind one keyword:
injector = Injector.new.
  bind(store: DBStore)
object = injector.get(Module)

# Bind nested dependencies:
injector = Injector.new.
  bind(store: DBStore).
  bind(db: SqliteDBImpl)
object = injector.get(Module)
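One possible way to get the wished-for behaviour is to read the keyword names of `initialize` through reflection. The sketch below is only an assumption about how such an Injector could work. MemoryDB and Recorder are hypothetical stand-ins (Recorder replaces the slide's Module class, because reopening Ruby's built-in Module would be unsafe).

```ruby
# Hypothetical Injector: maps constructor keyword names to classes at runtime.
class Injector
  def initialize
    @bindings = {}
  end

  # bind(store: DBStore) records a {:keyword => class} mapping.
  # Returns self so calls can be chained.
  def bind(**mapping)
    @bindings.merge!(mapping)
    self
  end

  # Instantiates klass, recursively building every bound keyword dependency.
  # Unbound keywords fall back to the defaults declared in initialize.
  def get(klass)
    keywords = klass.instance_method(:initialize).parameters
                    .select { |kind, _| kind == :key || kind == :keyreq }
                    .map { |_, name| name }
    args = {}
    keywords.each do |name|
      args[name] = get(@bindings[name]) if @bindings.key?(name)
    end
    klass.new(**args)
  end
end

class MemoryDB
  attr_reader :rows

  def initialize
    @rows = []
  end

  def insert(data)
    @rows << data
  end
end

class DummyStore
  def store(data); end
end

class DBStore
  attr_reader :db

  def initialize(db: MemoryDB.new)
    @db = db
  end

  def store(data)
    @db.insert(data)
  end
end

class Recorder
  attr_reader :store

  def initialize(store: DummyStore.new)
    @store = store
    @store.store("data")
  end
end

injector = Injector.new.bind(store: DBStore).bind(db: MemoryDB)
object = injector.get(Recorder)
```

Without any bindings, `Injector.new.get(Recorder)` uses the DummyStore default, which is the behaviour a unit test would want.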
Design Pattern 2: Dynamic Plugin Loader
[Diagram: Core, a Plugin Loader, and plugins.]
Core calls the Plugin Loader to load plugins.
Design Pattern 2: Dynamic Plugin Loader
[Diagram: Core, a Plugin Loader, and plugins that load further plugins.]
Plugins also call the Plugin Loader.
Plugins create an ecosystem.
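A minimal sketch of this pattern, assuming a registry-style loader. PluginLoader, register, new_plugin, and the three output classes are hypothetical names. A real loader would typically also `require` a file chosen by naming convention before looking up the class.

```ruby
# Hypothetical registry-style plugin loader.
module PluginLoader
  @registry = {}

  # Plugins call register when their file is loaded.
  def self.register(type, name, klass)
    @registry[[type, name]] = klass
  end

  # Core and plugins both call new_plugin to get an instance by name.
  def self.new_plugin(type, name)
    klass = @registry.fetch([type, name]) do
      raise ArgumentError, "unknown #{type} plugin: #{name}"
    end
    klass.new
  end
end

class StdoutOutput
  def emit(event)
    "stdout: #{event}"
  end
end
PluginLoader.register(:output, "stdout", StdoutOutput)

class UpcaseOutput
  def emit(event)
    "upcase: #{event.upcase}"
  end
end
PluginLoader.register(:output, "upcase", UpcaseOutput)

# A plugin that itself calls the plugin loader ("plugins call plugins").
class CopyOutput
  def initialize(names = ["stdout", "upcase"])
    @outputs = names.map { |name| PluginLoader.new_plugin(:output, name) }
  end

  def emit(event)
    @outputs.map { |output| output.emit(event) }
  end
end
PluginLoader.register(:output, "copy", CopyOutput)

copy = PluginLoader.new_plugin(:output, "copy")
copy.emit("hello")
```

Because CopyOutput only knows plugin names, new output plugins can be combined with it without touching the core.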
Design Pattern 3: Combination
Dependency Injection + Plugin Loader
[Diagram: Core whose classes are partly replaced by plugins, plus a Plugin Loader that loads further plugins.]
Plugin Architecture Design Patterns
a) Traditional Extensible Software Architecture
b) Plugin-based Software Architecture
> Dependency Injection (DI)
> Dynamic Plugin Loader
> Combination of those
There are trade-offs
> Choose the best solution for each project
Plugin Architecture of Fluentd
What’s Fluentd?
> Data collector for unified logging layer
> Streaming data transfer based on JSON
> Written in C & Ruby
> Plugin Marketplace on RubyGems
> http://www.fluentd.org/plugins
> Working in production
> http://www.fluentd.org/testimonials
Deployment of Fluentd
The problems around log collection…
Solution: N × M → N + M
plugins
# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  pos_file /tmp/pos_file
  format apache2
  tag web.access
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to ES and S3
<match web.*>
  type copy
  <store>
    type elasticsearch
    logstash_format true
  </store>
  <store>
    type s3
    bucket s3-event-archive
  </store>
</match>

<match metrics.*>
  type nagios
  host watch-server
</match>
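The `<match>` sections above route events by tag. The sketch below is a simplified illustration of that idea, not Fluentd's actual router. EventRouter is a hypothetical class, and it assumes that "*" matches exactly one dot-separated tag part.

```ruby
# Hypothetical event router: matches tags against patterns such as "web.*".
class EventRouter
  def initialize
    @rules = []
  end

  # Assumption: "*" matches exactly one dot-separated tag part.
  def match(pattern, &output)
    parts = pattern.split(".").map do |part|
      part == "*" ? "[^.]+" : Regexp.escape(part)
    end
    body = parts.join("\\.")
    @rules << [/\A#{body}\z/, output]
  end

  # Sends the event to the first rule whose pattern matches the tag.
  # Returns nil when no rule matches.
  def emit(tag, record)
    _, output = @rules.find { |regexp, _| regexp.match?(tag) }
    output ? output.call(tag, record) : nil
  end
end

router = EventRouter.new
router.match("web.*") { |tag, record| [:search_and_archive, tag, record] }
router.match("metrics.*") { |tag, record| [:monitoring, tag, record] }
router.emit("web.access", { "path" => "/" })
```

The core only needs to know tags and patterns. What happens to a matched event is entirely up to the output plugin behind the rule.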
Example: Simple forwarding
Example: HA & High performance
- HA (fail over)
- Load-balancing
- Choice of at-most-once or at-least-once
Example: Realtime search + Batch Analytics combo
All data
Hot data
Plugin Architecture of Fluentd
[Diagram: Fluentd Core contains the Event Router and a Plugin Loader. Input, Filter, Buffer, and Output plugins attach to the core.]
Plugin Marketplace using RubyGems.org
$ gem install fluent-plugin-s3
[Diagram: plugins are installed from RubyGems.org into the local /gems/ directory, where the Plugin Loader finds them.]
Fluentd’s Plugin Architecture
• Fluentd is a plugin-based event collector.
> Fluentd core: takes care of message routing between plugins.
> Plugins: do all other things!
• 300+ plugins released on RubyGems.org
• Fluentd loads plugins using Gem API.
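As an illustration of loading plugins through the Gem API, the sketch below searches installed gems for a file that follows a naming convention. The convention `fluent/plugin/<type>_<name>.rb` and the helper names are assumptions made for this example. Fluentd's real loader may work differently.

```ruby
require "rubygems"

# Assumed naming convention: <base_dir>/<type>_<name>.rb inside any installed gem.
# Gem.find_files searches the load path and the installed gems.
def find_plugin_files(type, name, base_dir: "fluent/plugin")
  Gem.find_files("#{base_dir}/#{type}_#{name}.rb")
end

# Requires the first matching file, or raises LoadError when none is installed.
def load_plugin(type, name)
  path = find_plugin_files(type, name).first
  raise LoadError, "plugin not found: #{type}_#{name}" if path.nil?
  require path
  path
end
```

With this convention, installing a gem is the only step a user needs: the core never has to list the available plugins.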
Plugin Architecture of Embulk
Embulk:
Open-source Bulk Data Loader
written in Java & JRuby
[Diagram: Embulk moves data between Amazon S3, MySQL, FTP, CSV files, access logs, Salesforce.com, Elasticsearch, Cassandra, Hive, and Redis.]
Reliable framework :-)
Parallel execution, transaction, auto guess, …and many by plugins.
Demo
Use case 1: Sync MySQL to Elasticsearch
MySQL (embulk-input-mysql) → kuromoji (embulk-filter-kuromoji) → Elasticsearch (embulk-output-elasticsearch)
Use case 2: Load from S3 to Analytics
Input: csv.gz on S3 (embulk-input-s3 + embulk-decoder-gzip + embulk-parser-csv)
Output: Treasure Data (embulk-output-td), BigQuery (embulk-output-bigquery), Redshift (embulk-output-redshift)
Executor: embulk-executor-mapreduce
Use case 3: Embulk as a Service at Treasure Data
REST API to load/export data
to/from Treasure Data
Embulk’s Plugin Architecture
[Diagram: Embulk Core with an Executor plugin that runs Input, Filter, and Output plugins, plus Guess plugins.]
[Diagram: a file-based Input is composed of FileInput, Decoder, and Parser plugins.]
[Diagram: a file-based Output is composed of Formatter, Encoder, and FileOutput plugins.]
Embulk’s Plugin Architecture
[Diagram: Embulk Core spans a Java runtime and a JRuby runtime. PluginManager uses InjectedPluginSource and JRubyPluginLoader. JRubyPluginLoader calls a plugin loader in the JRuby runtime. Plugin types: Executor plugin, InputPlugin, ParserPlugin, FilterPlugin, FormatterPlugin, OutputPlugin.]
Plugin Marketplace using RubyGems.org
$ embulk gem install embulk-input-oracle
[Diagram: the same architecture. Plugins are installed from RubyGems.org into the local /gems/ directory, where the JRuby plugin loader finds them.]
Plugin Package Structure
embulk-input-s3.gem
+- build.gradle
|
+- src/main/java/org/embulk/input/s3     (Java source files)
|    - S3FileInputPlugin.java
|      AwsCredentials.java
|
+- classpath/
|    - embulk-input-s3-0.2.6.jar         (compiled jar file)
|      aws-java-sdk-s3-1.10.33.jar       (all dependent jar files)
|      httpclient-4.3.6.jar
|
+- lib/embulk/input/
     - s3.rb                             (Ruby script to load the jar files)
Embulk Plugin Load Sequence
Java:
org.embulk.cli.Main.main(String[] args) {
    org.jruby.Main.main(
        "embulk.jar!/embulk/command/embulk_bundle.rb",
        args);
}

JRuby:
Bundler.setup_environment
Embulk::Runner = Embulk::Runner.new(
    .embulk.EmbulkEmbed::Bootstrap.new.initialize)
Embulk::Runner.run(ARGV)

Java:
org.embulk.exec.BulkLoader.run(…)
org.embulk.plugin.PluginManager.newPlugin(…)
{
    jruby = org.jruby.embed.ScriptingContainer()
    rubyObj = jruby.runScriptlet("Embulk::Plugin")
    jruby.callMethod(rubyObj, "new_java_input", "s3")
}
Embulk Plugin Load Sequence
Java:
org.embulk.plugin.PluginManager.newPlugin(…)

JRuby:
def new_java_input(type)
  rubyPluginClass = lookup(:input, type)
  return rubyPluginClass.new_java
end
Embulk Plugin Load Sequence
JRuby:
def new_java
  jars = Dir["classpath/**/*.jar"]
  factory = org.embulk.plugin.PluginClassLoaderFactory.new
  classloader = factory.create(jars)
  return classloader.loadClass("org.embulk.input.s3.S3FileInputPlugin")
end

Java:
PluginClassLoaderFactory.create(URL[] jarPaths) {
    return new PluginClassLoader(jarPaths);
}
Embulk
• Embulk is a plugin-based parallel bulk data loader.
• Guess plugins suggest which plugins are necessary and how to configure them.
• Executor plugins run plugins in parallel.
• Embulk core takes care of message passing between plugins.
• Embulk loads plugins using JRuby and Gem API.
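To make the "executor plugins run plugins in parallel" point concrete, here is a small thread-based sketch. ThreadExecutor and its run method are hypothetical names. Embulk's own executors are written in Java and are not shown here.

```ruby
# Hypothetical executor: runs independent tasks in parallel threads.
class ThreadExecutor
  def initialize(max_threads: 4)
    @max_threads = max_threads
  end

  # Each task is a callable. Results keep the order of the input tasks.
  def run(tasks)
    results = Array.new(tasks.size)
    queue = Queue.new
    tasks.each_with_index { |task, index| queue << [task, index] }

    workers = Array.new([@max_threads, tasks.size].min) do
      Thread.new do
        loop do
          begin
            # Non-blocking pop raises ThreadError once the queue is empty.
            task, index = queue.pop(true)
          rescue ThreadError
            break
          end
          results[index] = task.call
        end
      end
    end

    workers.each(&:join)
    results
  end
end

# Each task stands in for one input -> filter -> output run over one file.
files = ["a.csv", "b.csv", "c.csv"]
tasks = files.map { |file| -> { "loaded #{file}" } }
ThreadExecutor.new(max_threads: 2).run(tasks)
```

Because the executor only sees callables, swapping it for a different execution strategy does not affect the input, filter, or output plugins.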
./embulk.jar
$ ./embulk.jar guess example.yml
executable jar!
Header of embulk.jar
: <<BAT
@echo off
setlocal
set this=%~f0
set java_args=
rem ...
java %java_args% -jar %this% %args%
exit /b %ERRORLEVEL%
BAT
# ...
exec java $java_args -jar "$0" "$@"
exit 127
PK...
embulk.jar is a shell script
• The lines from “: <<BAT” to “BAT” are the argument of the “:” command (a heredoc). “:” is a command that does nothing.
• #!/bin/sh is optional: without it, the file is still run as a shell script.
• exec java $java_args -jar "$0" "$@" runs java -jar $0. The shell script exits here (following data is ignored).
embulk.jar is a bat file
• In a .bat file, the leading “:” makes the first line a comment line.
• The lines from “@echo off” to “exit /b %ERRORLEVEL%” run java -jar on the file itself.
• The .bat exits at “exit /b %ERRORLEVEL%” (following lines are ignored).
embulk.jar is a jar file
• The jar data starts at “PK...”, after the script header.
• The jar (zip) format ignores headers: the file entries are listed in a footer at the end of the file.
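The reason a script header does not break the jar can be sketched with the zip "end of central directory" record, which a reader locates by searching from the end of the file. The example below builds the smallest possible zip (an empty archive is just that 22-byte record) behind a simplified, made-up script header. Real readers also check offsets and comments, which this sketch skips.

```ruby
# Signature of the zip "end of central directory" (EOCD) record.
EOCD_SIGNATURE = "PK\x05\x06".b

# An empty zip archive is only the 22-byte EOCD record:
# the 4-byte signature followed by 18 zero bytes.
EMPTY_ZIP = (EOCD_SIGNATURE + ("\x00".b * 18)).freeze

# Finds the EOCD record by searching backwards from the end of the data,
# so anything prepended to the archive is simply skipped.
def find_end_of_central_directory(bytes)
  bytes.b.rindex(EOCD_SIGNATURE)
end

# Simplified stand-in for the embulk.jar header (not the real header).
header = "exec java -jar \"$0\" \"$@\"\nexit 127\n".b
polyglot = header + EMPTY_ZIP

offset = find_end_of_central_directory(polyglot)
puts "script header: #{header.bytesize} bytes, zip footer found at byte #{offset}"
```

The footer is found immediately after the header, which is why a shell or .bat preamble can sit in front of the archive.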
Pitfalls & Challenges
• Plugin version conflicts
• Performance impact due to loosely-coupled API
Plugin Version Conflicts
[Diagram: Embulk Core runs in one Java runtime. embulk-input-s3.jar depends on aws-sdk.jar v1.9, while embulk-output-redshift.jar depends on aws-sdk.jar v1.10.]
Version conflicts!
Multiple Classloaders in JVM
[Diagram: in the same Java runtime, Class Loader 1 holds embulk-input-s3.jar with aws-sdk.jar v1.9, and Class Loader 2 holds embulk-output-redshift.jar with aws-sdk.jar v1.10.]
Isolated environments
Version conflicts in a JRuby Runtime
[Diagram: Embulk Core's Java runtime hosts one JRuby runtime. embulk-input-sfdc.gem depends on httpclient 2.5.0, while embulk-input-marketo.gem depends on httpclient v2.6.0.]
Version conflicts!
Multiple JRuby Runtime?
[Diagram: inside one Java runtime, Fluentd Core with Sub VM 1 holding fluent-plugin-sql.gem (activerecord ~> 3.4) and Sub VM 2 holding fluent-plugin-presto.gem (activerecord ~> 4.2).]
Isolated environments?
Version conflicts in Fluentd
[Diagram: Fluentd Core runs in one CRuby runtime. fluent-plugin-sql.gem requires activerecord ~> 3.4, while fluent-plugin-presto.gem (?) requires activerecord ~> 4.2.]
Version conflicts!
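The conflict can be checked with RubyGems' own version classes. The sketch below uses the two requirements from the slide. The candidate version list and the helper name compatible_versions are made up for illustration.

```ruby
require "rubygems"

# Returns the candidate versions that satisfy every requirement at once.
def compatible_versions(requirements, candidates)
  reqs = requirements.map { |requirement| Gem::Requirement.new(requirement) }
  candidates.select do |version|
    reqs.all? { |req| req.satisfied_by?(Gem::Version.new(version)) }
  end
end

# Illustrative candidates, not a list of real releases.
candidates = ["3.4.0", "4.0.0", "4.2.0", "4.2.5"]

compatible_versions(["~> 4.2"], candidates)            # => ["4.2.0", "4.2.5"]
compatible_versions(["~> 3.4", "~> 4.2"], candidates)  # => []
```

Since one Ruby VM can only activate one version of a gem, an empty result means the two plugins cannot be loaded together.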
Challenges
• Version conflict is not completely solved.
  • Java can use multiple ClassLoaders.
  • I haven’t figured out how to do the same thing in Ruby.
• I don’t have clear ideas to solve the performance impact.
  • Write more code to learn?
Wrapping Up
“How did I build Plugin Architecture?”
• I built Fluentd using a dynamic plugin loader.
  • “Plugin calls Plugins”
  • Most features are provided by the ecosystem of plugins.
• I built Embulk using a combination of:
  • Dependency Injection,
  • JRuby to implement a Dynamic Plugin Loader,
  • Java VM and nested ClassLoaders to load multiple versions of plugins.
• But some problems are not solved yet:
  • Version conflicts in a Ruby VM.
  • Design patterns of plugins AND high performance.
What’s Next?
• You build plugin-based software architecture!
  • And you’ll tell me how you did it :-)
• I’m working on another project: a distributed workflow engine
  • Java VM + Python
Thank You!
Sadayuki Furuhashi

Founder & Software Architect
