From Fedora Project Wiki

Java API changes checker

Contact Information

Why do you want to work with the Fedora Project?

1. I want to improve the OS that I use every day. First of all, it will make my own life easier.

2. I found a proposal that is very interesting to implement: Summer_coding_ideas_for_2012#Java_API.2FABI_changes_checker

3. I want to take part in the development of a large real-world project. I have not had that experience yet.

Do you have any past involvement with the Fedora project or with any other open source project as a contributor (if possible, please add some references as well)?

1. No, this is my first time.

Did you participate in past GSoC programs? If so, in which years and with which organizations?

1. Last year I proposed an improvement to SDL ([1]), but they did not find a mentor for me.

Will you continue contributing to/supporting the Fedora project after the GSoC 2012 program? If yes, which team(s)/area(s) are you interested in?

1. Of course, I will continue improving the project I have built.

2. I'm interested in improving laptop support. I got a new laptop and found that I can't use some of its hardware under Linux.

3. I use Eclipse as my main IDE, so maybe I'll contribute to the Fedora Eclipse project.

Why should we choose you over the other applicants?

1. I propose a project that I really want to implement, so I won't give up on it after a month or two.

2. I have experience writing code in C++, Java and C#, gained mostly from university courses. Part of my code is hosted on GitHub: [2]

3. I learn new things very fast when they are related to programming. If the project requires a library or tool that is unknown to me, I will be able to learn how to use it and finish the project.

4. I have taken part in ACM ICPC and other coding competitions, so I know how to test and optimize code.

Proposal Description

Overview

Original idea: Summer_coding_ideas_for_2012#Java_API.2FABI_changes_checker

Libraries written in Java add, remove and modify their public interfaces from time to time. This is normal, but currently it is very hard to guess what effect updating a library to a new version will have on the rest of the system. What is needed is a tool that would be able to tell us: "With the update of package java-library to version 2.0, function X(b) has been removed. This function is used in package java-app."


I will create a tool that takes a set of .jar archives (and/or .class files) and tries to resolve the dependencies between them, as a C/C++ linker does. Checking that a new version of a library doesn't break anything will then be easy: just replace the library with the new version and check the dependencies again. Of course, if the dependency check fails, the tool must output all information that could possibly help. All dependency information must be generated automatically from the .jar/.class files. Because an RPM is just a specially structured archive, it will be easy to read .jar/.class files from it. Old dependency information must be kept somewhere, so that error reports contain not only the message "Class/method xxx required by yyy not found", but also "Xxx was provided by zzz".
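
The linker-like check described above could look roughly like the following. This is only a minimal sketch using plain collections: the class name `LinkCheckSketch`, the method `check`, the symbol string format and the sample package names are all assumptions made for illustration, not the design of the proposed tool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: every name in this class is an illustrative assumption.
class LinkCheckSketch {

    /**
     * @param provided          archive name -> symbols (classes/methods) it defines
     * @param required          archive name -> symbols it references
     * @param previousProviders symbol -> archive that defined it before the update
     * @return one message per required symbol that no archive in the set defines
     */
    static List<String> check(Map<String, Set<String>> provided,
                              Map<String, Set<String>> required,
                              Map<String, String> previousProviders) {
        // Index every defined symbol by the archive that provides it.
        Map<String, String> providerOf = new HashMap<>();
        provided.forEach((archive, symbols) ->
                symbols.forEach(symbol -> providerOf.put(symbol, archive)));

        List<String> errors = new ArrayList<>();
        // Sorted copies keep the report order stable between runs.
        Map<String, Set<String>> sortedRequired = new TreeMap<>(required);
        for (Map.Entry<String, Set<String>> entry : sortedRequired.entrySet()) {
            for (String symbol : new TreeSet<>(entry.getValue())) {
                if (providerOf.containsKey(symbol)) {
                    continue; // resolved, like a linker finding a definition
                }
                String message = symbol + " required by " + entry.getKey() + " not found";
                String old = previousProviders.get(symbol);
                if (old != null) {
                    message += "; it was provided by " + old;
                }
                errors.add(message);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // java-library 2.0 no longer defines x(int), which java-app still calls.
        Map<String, Set<String>> provided =
                Map.of("java-library-2.0", Set.of("org.example.Lib.y()"));
        Map<String, Set<String>> required =
                Map.of("java-app", Set.of("org.example.Lib.x(int)", "org.example.Lib.y()"));
        Map<String, String> before =
                Map.of("org.example.Lib.x(int)", "java-library-1.0");
        check(provided, required, before).forEach(System.out::println);
    }
}
```

Re-running `check` after swapping in the new library's symbol set is the "replace and check again" step; the `previousProviders` map plays the role of the stored old dependency information.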

Additionally, I think another feature would be useful: the tool will be able to automatically find all dependencies of a specific jar/package.

The need you believe it fulfills

Typically, Java searches for a class/method at runtime, when it is requested by the currently executing code. If you launch a Java application and it doesn't report any errors at startup, you still can't be sure that it won't crash later because of unsatisfied dependencies.
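
The late-failure behaviour can be illustrated with a small sketch. It uses an explicit reflective lookup with `Class.forName` as a stand-in for the implicit resolution the JVM performs on first use (which would surface as a `NoClassDefFoundError` instead); the class names `LazyLookupDemo` and `com.example.NotInstalled` are made up for the example.

```java
// Hypothetical demo; com.example.NotInstalled is a made-up, absent class.
class LazyLookupDemo {

    /** Looks a class up by name only at the moment this method is called. */
    static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Startup succeeds even though a dependency is missing.
        System.out.println("Application started without errors");

        // Only when this code path runs does the missing dependency surface.
        if (!isAvailable("com.example.NotInstalled")) {
            System.out.println("Missing at runtime: com.example.NotInstalled");
        }
    }
}
```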

The proposed tool will be able to ensure that all dependencies can be resolved, without launching or even installing/unpacking Java applications. It will enable easier and safer updates of Java applications/libraries, and will make the lives of Java package maintainers easier.

Any relevant experience you have

I have a lot of experience writing code in Java, because my university uses Java for more than half of all programming courses. I am also familiar with the packaging system. I have not created a new package yet, but I am able to add a patch or rebuild the kernel package with my own configuration.

How do you intend to implement your proposal

I will write a class library and a set of command-line utilities that will:

1. Parse Java .class files and generate lists of the methods defined and lists of the methods and classes referenced by the code inside the class. Parsing will be done using a third-party library, possibly Apache Commons BCEL.

Class files from .jar archives can easily be read using Java's class library; .rpm files can be opened using jRPM.

2. Merge the lists generated in the previous step and produce a combined list of defined methods/classes and of still-required methods/classes (unresolved dependencies).

This can be used to generate lists of defined/required methods and classes for a .jar file or even an .rpm package.

Also, API changes can be viewed by running diff on the lists generated from different versions of a library/package.

3. Record the defined classes/methods in a database, together with the package/library name and version, and provide search by class name/method signature on this database.

To check a set of .jar/.rpm files, lists of classes/methods should be generated for every class inside them, and all these lists should be merged. After that, every element of the list of still-required classes/methods can be searched for in the database.
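
To give an idea of what step 1 involves, here is a minimal sketch that lists the classes referenced by a .class file by reading its constant pool directly. It deliberately uses only the standard library instead of Apache Commons BCEL, so it is a swapped-in technique and not the planned implementation; it also ignores method references. The class name `ConstantPoolSketch` and method `referencedClasses` are assumptions for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: hand-rolled constant pool reader, not BCEL.
class ConstantPoolSketch {

    /** Returns the internal names (e.g. java/lang/Object) of all CONSTANT_Class entries. */
    static Set<String> referencedClasses(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort(); // minor version
            in.readUnsignedShort(); // major version
            int count = in.readUnsignedShort();
            String[] utf8 = new String[count];
            int[] classNameIndex = new int[count];
            for (int i = 1; i < count; i++) { // constant pool indices start at 1
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: utf8[i] = in.readUTF(); break;                    // Utf8
                    case 7: classNameIndex[i] = in.readUnsignedShort(); break; // Class
                    case 8: case 16: case 19: case 20:                         // one u2 operand
                        in.readUnsignedShort(); break;
                    case 15:                                                   // MethodHandle: u1 + u2
                        in.readUnsignedByte(); in.readUnsignedShort(); break;
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.readInt(); break;                                   // four bytes of operands
                    case 5: case 6:                                            // Long/Double use two slots
                        in.readLong(); i++; break;
                    default:
                        throw new IllegalArgumentException("unknown constant pool tag " + tag);
                }
            }
            // The result also contains the class's own name and array descriptors.
            Set<String> names = new TreeSet<>();
            for (int index : classNameIndex) {
                if (index != 0) {
                    names.add(utf8[index]);
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Scan a class that ships with the JDK.
        try (InputStream in = String.class.getResourceAsStream("String.class")) {
            if (in == null) {
                throw new IllegalStateException("String.class resource not found");
            }
            referencedClasses(in.readAllBytes()).forEach(System.out::println);
        }
    }
}
```

A library such as BCEL would additionally expose the method and field references needed for the full lists of required methods.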

Such a set of utilities won't be restricted to checking whether an update breaks something. For example, it will be able to automatically generate a list of dependencies. It will even be possible to generate .spec files from the output of these utilities.
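
One of the uses mentioned above, viewing API changes between two versions of a library, amounts to a set difference over the generated lists. A minimal sketch, assuming each list is available as a set of signature strings (the class name `ApiDiffSketch` and the sample signatures are hypothetical):

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: compares two generated lists of defined symbols.
class ApiDiffSketch {

    /** Symbols present in the old version but missing from the new one. */
    static Set<String> removed(Set<String> oldApi, Set<String> newApi) {
        Set<String> result = new TreeSet<>(oldApi);
        result.removeAll(newApi);
        return result;
    }

    /** Symbols that appear only in the new version. */
    static Set<String> added(Set<String> oldApi, Set<String> newApi) {
        return removed(newApi, oldApi);
    }

    public static void main(String[] args) {
        Set<String> v1 = Set.of("org.example.Lib.x(int)", "org.example.Lib.y()");
        Set<String> v2 = Set.of("org.example.Lib.y()", "org.example.Lib.z()");
        removed(v1, v2).forEach(s -> System.out.println("- " + s));
        added(v1, v2).forEach(s -> System.out.println("+ " + s));
    }
}
```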

I think it will be useful to implement the command-line utilities as Ant tasks as well.

It may seem that I am trying to implement too many features, but I think that Apache BCEL will make development significantly easier, and the full source code will not exceed 3000-4000 lines.

Final deliverable of the proposal at the end of the period

  • Command-line Java utilities and a set of Ant tasks.
  • RPM packages of these utilities.
  • Possibly separate packages for dependencies.

A rough timeline for your progress

  • April 24 - May 15 - Discuss additional details about the project's implementation and future usage. I should have done this earlier, but I saw this project idea too late.
  • May 15 - June 1 - Implement a utility that generates lists of defined and required methods/classes for a set of .class files. It seems that with Apache BCEL this will take only a few lines of code.
  • June 1 - July 1 - Add support for .jar files and .rpm packages, so that I have a completely usable part of the project before the mid-term evaluations; at the least, it will be able to show API differences between library versions. Also write the Ant tasks.
  • July 1 - August 1 - Implement a database for storing information about defined classes/methods. Add support for searching it to the code written earlier.
  • August 1 - August 13 - Build RPM packages for the applications/libraries developed and, possibly, for dependencies that aren't in the Fedora repositories yet.

I have exams in June, so most of the work will be done in July.

Have you communicated with a potential mentor? If so, who?