Pentaho (Kettle for ETL)

Pentaho Data Integration (also known as Kettle) delivers powerful Extraction, Transformation, and Loading (ETL) capabilities using a metadata-driven approach.

The idea of this meetup is to introduce the Pentaho suite. As it has many components, only Kettle, the tool used for ETL (Extraction, Transformation, and Loading), will be presented.

If the session is successful, other related components can be presented in coming meetups:

• Pentaho BI Server

• Report Designer

• Weka (For data mining)

• Pentaho Dashboards

The following points can be covered (depending on interest):

1.- Manual and automated (using Puppet) installation of Pentaho Kettle

2.- Basic examples using different connectors: databases, JSON, CSV, etc.

3.- Task scheduling using cron and the Pentaho BI Server

4.- Load balancing using Carte

Please send your suggestions and comments so that I can prepare an interesting and productive meetup.


Before the meetup, please prepare your machine as follows:

1.- Download the Pentaho tools from Pentaho’s community webpage

2.- The tools to be used are:

 -. Data Integration (aka Kettle)

 -. Report Designer

 -. Business Analytics Platform

3.- Unzip the tools to a known path, e.g. /home/user/pentaho/[DataIntegrator|ReportDesigner|Business Analytics]

4.- Ensure you have an up-to-date Java 8 version installed on your PC

5.- Check that you can run both tools.

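 -. Data Integration’s executable is spoon (spoon.sh on Linux/macOS, Spoon.bat on Windows)

 -. Report Designer’s executable is report-designer (report-designer.sh on Linux/macOS, report-designer.bat on Windows)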

6.- Clone the Git repository with all the files needed:

git clone
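
As a quick sanity check for steps 3 to 5, the following is a minimal shell sketch, not an official Pentaho script. It assumes a Linux or macOS machine, that the tools were unzipped under $HOME/pentaho using the directory names from step 3, and that the launchers are called spoon.sh and report-designer.sh; PENTAHO_HOME, java_major and check_tool are names made up for this example, so adjust them to your setup.

```shell
#!/bin/sh
# Assumption: the tools were unzipped under this directory (see step 3).
PENTAHO_HOME="${PENTAHO_HOME:-$HOME/pentaho}"

# Extract the major version from a Java version string,
# e.g. "1.8.0_292" -> 8 and "11.0.2" -> 11.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Report whether a launcher script exists and is executable.
check_tool() {
    if [ -x "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Step 4: check which Java version is on the PATH.
if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    echo "Java major version: $(java_major "$version")"
else
    echo "Java not found on PATH"
fi

# Step 5: check that both launchers are where we expect them
# (directory and file names are assumptions, see above).
check_tool "$PENTAHO_HOME/DataIntegrator/spoon.sh" || true
check_tool "$PENTAHO_HOME/ReportDesigner/report-designer.sh" || true
```

Running it prints the detected Java major version and, for each launcher, whether it was found at the assumed location. It does not start the tools; to do that, run the launcher scripts directly.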

See you there!!

