Improving the searching/storing of chat logs for WAARTAA, and adding an admin UI.
An overview of your proposal
Waartaa is a web IRC service built on MeteorJS. It has a great UI and an appealing user experience, along with a mail notification feature that alerts you whenever someone tags you on any channel. I wish to add a central hub to store IRC logs, where users can search the logs of any channel on the selected IRC server, view them, and save them, thus improving the current central logging. I also plan to fix memory leaks, so that waartaa is suited for mobile devices, and to build an admin UI.
The need you believe it fulfills
Currently, when a user logs in, he/she can see the past logs, which load at a rate that depends on the connection speed. However, to see the entire history of messages, the user has to wait and keep scrolling the infinite scroll. With a central hub for logs, users could simply download the log file or run fast search queries, which would greatly enhance the user experience. This would also make waartaa a complete web IRC client with a fast, searchable library of logs.
Any relevant experience you have
I have been involved in web development for a year now, with technologies like Django and web2py, and I am also skilled in JS, jQuery and CSS. Since MeteorJS is a JS web framework, I think working with it won't be difficult.
How you intend to implement your proposal
The implementation of a central log library would involve:
1. Building an efficient way to retrieve, by category, the logs of the users currently on the various channels of waartaa. This component would receive the messages for the channels and queue them for storing. Since we are going to use logstash for managing our logs, and logstash can use RabbitMQ as its broker, queue management is the least of our concerns. Also, every log is already well structured in proper JSON format, so there is no need to use grok to parse unstructured data. We need to write a config file for logstash, which could look somewhat like this:
input {
  file { path => "/var/log/irclogchannel1" type => "irclog" }
  file { path => "/var/log/irclogchannel2" type => "irclog" }
  .... .... .... ....
}
We assume that for every channel there is a separate file which needs to be managed via logstash.
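To make the one-file-per-channel assumption concrete, here is a minimal sketch of how messages might be appended as one JSON object per line to a channel-specific file that logstash could then watch. The function names, the field names (`channel`, `nick`, `message`, `timestamp`) and the file naming scheme are hypothetical and chosen only for illustration; they are not waartaa's actual schema.

```python
import json
import os
from datetime import datetime, timezone


def channel_log_path(log_dir, channel):
    """Map a channel name such as '#fedora' to a per-channel log file path."""
    # Hypothetical naming scheme: drop the leading '#' and prefix with 'irclog'.
    safe_name = channel.lstrip("#").replace("/", "_")
    return os.path.join(log_dir, "irclog" + safe_name)


def append_log(log_dir, channel, nick, message, timestamp=None):
    """Append one message as a single JSON line to the channel's log file."""
    record = {
        "channel": channel,
        "nick": nick,
        "message": message,
        # Default to the current UTC time when no timestamp is supplied.
        "timestamp": (timestamp or datetime.now(timezone.utc)).isoformat(),
    }
    path = channel_log_path(log_dir, channel)
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return path
```

Writing one JSON object per line keeps each record self-describing, which fits the point above that grok parsing should not be needed.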
2. Since logstash works in conjunction with elasticsearch, we simply need to add this to the output section of the config to allow elasticsearch to work alongside logstash:
output {
  elasticsearch { embedded => true }
}
Also, since elasticsearch is schema-free, we simply put the JSON logs into it, and it makes them searchable.
3. Graphs can be built from the real-time data received from elasticsearch by using kibana. It can also be used to show a pie chart depicting the share of logs on waartaa servers. Real-time graphs showing logs/minute for each channel can also be drawn on the searching/retrieving platform, which would use elasticsearch and offer topic-specific search.
For the UI of the search platform, we would use Bootstrap, jQuery and Ajax. We would display the graphs obtained from kibana, such as messages per hour, total messages, most active user, etc.
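In the proposed setup kibana would derive these figures from elasticsearch, so the sketch below is only meant to pin down what each figure means. It assumes each log record is a dictionary with hypothetical `nick` and `timestamp` (ISO 8601) fields.

```python
from collections import Counter
from datetime import datetime


def summarize(records):
    """Compute total messages, messages per hour and the most active user."""
    per_hour = Counter()
    per_user = Counter()
    for record in records:
        # Bucket by hour: keep the date and hour, drop minutes and seconds.
        stamp = datetime.fromisoformat(record["timestamp"])
        per_hour[stamp.strftime("%Y-%m-%d %H:00")] += 1
        per_user[record["nick"]] += 1
    # most_common(1) returns [(nick, count)] or an empty list.
    most_active = per_user.most_common(1)[0][0] if per_user else None
    return {
        "total_messages": sum(per_user.values()),
        "messages_per_hour": dict(per_hour),
        "most_active_user": most_active,
    }
```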
We can also add configs to elasticsearch to enable the kind of search support we want. For example, I may wish to search for a text only on channels starting with "f", or on all kde channels.
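A channel-restricted search of this kind could be expressed as an elasticsearch query body along the following lines. This is a sketch under assumptions: the field names `message` and `channel` are hypothetical, the `channel` field is assumed to be indexed as an exact (non-analyzed) value so that wildcard patterns apply to the whole channel name, and the `bool` query with a `filter` clause is the form used by newer elasticsearch releases, so older versions may need a different structure.

```python
import json


def build_channel_search(text, channel_pattern):
    """Build a search body matching `text` only in channels matching a pattern."""
    return {
        "query": {
            "bool": {
                # Full-text match on the message body.
                "must": [{"match": {"message": text}}],
                # Restrict results to channels whose name matches the wildcard.
                "filter": [
                    {"wildcard": {"channel": {"value": channel_pattern}}}
                ],
            }
        }
    }


if __name__ == "__main__":
    # Channels starting with "f", then all kde channels.
    print(json.dumps(build_channel_search("release", "#f*"), indent=2))
    print(json.dumps(build_channel_search("release", "#kde*"), indent=2))
```

The function only builds the request body; sending it to an elasticsearch search endpoint is left out on purpose.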
4. To make waartaa mobile-device compliant, it needs to be responsive. Currently waartaa is responsive and runs smoothly; however, memory leaks can make it really slow on mobile devices, which are limited in memory, whereas a PC can cope with them without any noticeable difference. There are memory leak regressions at present. We need to check for memory leaks in popular browsers like Chrome, Firefox and Internet Explorer, and fix them for smooth functioning on mobile devices.
5. Integrating an admin interface should be fairly easy, as we could begin with houston (https://github.com/gterrono/houston), which provides a django-admin-like interface for handling the collections in the meteor app, and modify it according to our needs. It has a basic UI, so we would need to do the styling.
A rough timeline for your progress
:Community Bonding Period:----------
21 April - 24 April : I am going to re-discuss my implementation plans with my mentors and the waartaa community, incorporating the suggestions received and clarifying the design issues (if any).
25 April - 28 April : Assessing the changes that might be needed to gather the logs.
29 April - May 02 : Getting familiar with logstash. Trying to implement small-scale logging to get a feel of the API methods.
May 03 - May 06 : Using elasticsearch, and doing basic analysis of logs using kibana, to get a feel of the working.
May 07 - May 14 : Plan a rough work scheme for implementing the bot and the store-and-retrieve feature.
May 15 - May 18 : Re-discuss the feasibility of the work scheme with the mentors, and incorporate the changes suggested, if any.
:Coding Period Begins:-------------
May 19 - May 22 : Implement the changes to the JS backend (if any are needed) and make a system for gathering logs in real time.
May 23 - May 24 : Test the changes made locally, by fetching logs from a list of channels on a server.
May 25 - May 28 : Automating the process of fetching the list of channels, and then writing the logic for detecting inactive channels.
May 29 - June 4 : Integrating the results fetched by the bot with logstash, storing them to allow easy retrieval via elasticsearch.
June 5 - June 12 : Integrating a basic admin interface for waartaa using meteor-accounts-admin-ui.
June 13 - June 23 : Deploying the code on the server and rigorously testing it, for performance and consistency issues.
June 24 - June 27 : Discussing with the mentors about the current development, and getting their reviews/comments on github/IRC/mailing list.
Mid term deliverables: The IRC logs hub is capable of sending the logs of the channels to the log servers via logstash. We also have a basic admin interface ready.
June 28 - July 4 : Designing and implementing the UI for the search platform, keeping in mind the need to display graphs and recent activity happening on the IRC servers.
July 5 - July 12 : Checking for sources of memory leaks, fixing them and testing them to integrate mobile support.
July 13 - July 20 : Rigorously testing waartaa on a mobile device with browsers like Google Chrome, after deploying. Fix errors, if any.
July 21 - August 1 : Completing the admin interface, while continuing to check for memory leaks.
August 2 - August 5 : Documenting the code, and styling the code to fit with the coding style.
August 1 - August 22 : Backup time for unforeseen delays.
End term deliverables: A full-fledged working central hub for storing and retrieving IRC logs from waartaa.
Post GSoC ----------> Maintain waartaa, and implement newer features. Become a part of the waartaa community, and invite more people to join in.
Any other details you feel we should consider
A few mockup UIs: 1. Landing Page 2. Search Page 3. Log Display/Download
Have you communicated with a potential mentor? If so, who?
I have interacted heavily with the potential mentors, Sayan and Ratnadeep, discussing with them the possible ways of implementing the said features.