From Fedora Project Wiki
Revision as of 15:03, 21 March 2014

GlusterFS-iostat

Project overview

I propose to build glusterfsiostat, a tool for GlusterFS that offers the same functionality as nfsiostat (written by Sebastien Godard) but works on Gluster mounts. glusterfsiostat will report performance information for the glusterfs mounts on a system through a standard CLI, and will integrate with a graphing utility to visualize that data.

The need you believe it fulfills

GlusterFS collects I/O statistics with the io-stats translator, on both the servers and the clients. Currently, however, only the server stats can be easily obtained and displayed, via the 'gluster volume top/profile' commands. There is a need for an analogous mechanism for obtaining and displaying these stats on the clients. Having them available will help in profiling and understanding GlusterFS client performance: the throughput and performance of the system under varied conditions can be measured and then visualized.

Any relevant experience you have

GlusterFS is implemented mainly in C. I have worked with and contributed to C code before (GSoC 2013). I have been head of the web development team at CSI-JMI (a student-run body) and have taken part successfully in weekend-long hackathons during my two and a half years of college. My experience building web applications and online games will help me visualize the data from glusterfsiostat with tools like Kibana and Logstash. I also spent a major portion of my time a few months back understanding Gluster xlators, dissecting the io-stats xlator code in particular with the help of code-spelunking tools like cscope.

How you intend to implement your proposal

The current design of the io-stats translator logs events by dumping data into a log file. This is cumbersome for users on the client, who have to parse the log file to get any statistics. The implementation of glusterfsiostat will consist mainly of the following three components:

  • A communication channel will be implemented with the io-stats translator. There is currently no way to query the io-stats translator directly on clients. This channel will expose the state of the built-in counters that io-stats already maintains. To that end, the information to be exposed from the counters first needs to be identified. Then, for each item fetched from the counters, an interface (a function) will be written to regulate its access. For post-processing, a logging facility will store the data at regular intervals. The channel could be implemented in a manner similar to the 'gluster profile' command on the servers.
  • A command-line tool will be built to obtain the stats from io-stats and present them in a suitable format, with an option to display data for any gluster mount accessible on the system. It will be similar to the nfsiostat tool, which prints the total data read and written in KB and the speed of the read/write operations on the devices. The implementation of the 'gluster volume top/profile' commands could help in understanding the various stats provided by io-stats.
  • For visualization and for building graphs and charts, rrdtool, written by Tobias Oetiker, will be used. This requires adding rrdtool as an optional compile-time dependency in the Gluster build process. If rrdtool is not installed on the build machine, Gluster will compile io-stats without rrdtool support for saving statistics.
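The nfsiostat-style output described in the second component comes down to turning two snapshots of cumulative counters into a rate. The following is only a minimal sketch of that arithmetic. The struct and function names (iostat_sample, iostat_rate, iostat_rate_between) are hypothetical and do not exist in GlusterFS, and the sketch assumes that io-stats can expose cumulative read and write byte counts per mount.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of cumulative counters for one mount.
 * These field names are illustrative, not the io-stats layout. */
struct iostat_sample {
    uint64_t read_bytes;   /* total bytes read so far */
    uint64_t write_bytes;  /* total bytes written so far */
    double   timestamp;    /* seconds, taken from a monotonic clock */
};

/* Throughput over the interval between two samples. */
struct iostat_rate {
    double read_kb_per_sec;
    double write_kb_per_sec;
};

/* Compute KB/s between two snapshots. 'cur' must have been taken
 * after 'prev', and the counters must not have been reset in between. */
struct iostat_rate
iostat_rate_between(const struct iostat_sample *prev,
                    const struct iostat_sample *cur)
{
    struct iostat_rate rate;
    double elapsed = cur->timestamp - prev->timestamp;

    assert(elapsed > 0.0);
    assert(cur->read_bytes >= prev->read_bytes);
    assert(cur->write_bytes >= prev->write_bytes);

    rate.read_kb_per_sec =
        (double)(cur->read_bytes - prev->read_bytes) / 1024.0 / elapsed;
    rate.write_kb_per_sec =
        (double)(cur->write_bytes - prev->write_bytes) / 1024.0 / elapsed;
    return rate;
}
```

A CLI built this way would take a sample at each reporting interval and print the rate against the previous one, so the translator itself only has to expose plain counters.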

Gluster checks for compile-time dependencies with the autoconf macro AC_ARG_ENABLE. This checks whether the user passed the relevant option (such as --enable-rrdtool) at configure time. When the option is given, we have to check that an executable named rrdtool is accessible on the user's PATH; the macro AC_CHECK_PROG will be used for that. If the tool exists and is accessible, configuration proceeds; otherwise the configure script fails with an error.

The rrd code can't be isolated from the existing io-stats code in a separate file or directory (otherwise it would have been just a question of compiling or not compiling that file). So the way to compile rrd-related code only when the user asks for it is to guard it with preprocessor macros in the io-stats source file, defined at compile time by appending -DFLAG to CFLAGS in the Makefile through autotools. Alternatively, the flag can be defined globally for Gluster during compilation by inserting its definition into confdefs.h.
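The macro-guard approach can be sketched as follows. This is an illustration only: HAVE_RRDTOOL is a hypothetical macro name standing in for the -DFLAG mentioned above, the functions stats_backend_name and stats_save are made up for this example, and no real rrdtool call is made.

```c
#include <assert.h>
#include <stdio.h>

/* HAVE_RRDTOOL is a hypothetical macro. The build would define it
 * (e.g. -DHAVE_RRDTOOL in CFLAGS) only when configure found rrdtool. */
const char *
stats_backend_name(void)
{
#ifdef HAVE_RRDTOOL
    return "rrdtool";   /* rrd-specific code would be compiled in here */
#else
    return "logfile";   /* default path when rrdtool support is off */
#endif
}

/* Format one stats record into 'buf', tagged with the backend that was
 * selected at compile time. Returns the snprintf result, i.e. the number
 * of characters the full record needs, excluding the terminating NUL. */
int
stats_save(char *buf, size_t len,
           unsigned long long read_bytes, unsigned long long write_bytes)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s: read=%llu write=%llu",
                    stats_backend_name(), read_bytes, write_bytes);
}
```

Built without the flag, the code takes the log-file branch. Building the same file with -DHAVE_RRDTOOL switches the branch without touching any other source file, which is the property the proposal relies on.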


A rough timeline for your progress

Duration                               Task
May 19, 2014 -- May 31, 2014           Interact with the Gluster community in order to standardize which counters and stats the tool will provide
June 1, 2014 -- June 15, 2014          Develop the interfaces which will expose the io-stats counters
June 16, 2014 -- June 27, 2014         Implement the CLI
June 28, 2014 -- July 10, 2014         Add the optional compile-time dependency on the visualization tool
July 11, 2014 -- August 11, 2014       Add functionality to store io-stats data in a logging facility for future retrieval
August 12, 2014 -- August 20, 2014     Scrub code, write documentation
Final Evaluation                       Final Evaluation


Any other details you feel we should consider

Have you communicated with a potential mentor? If so, who?

Yes, I've had regular email correspondence with Krishnan Parthasarathi from Red Hat, who's actively involved with the Gluster Community. I've also had encouragement from Vijay Bellur and John Mark Walker for contributing to this project.