From Fedora Project Wiki
m (internal link cleaning)
 
(8 intermediate revisions by 2 users not shown)
Line 1: Line 1:
= Watt hunting competition =
= Watt hunting competition =
[[File:Whcompetition.jpg|600x400px]]
This competition was part of [[DeveloperConference2009|DevCon09]] event. It takes place in lab - where was 10 computers setting up for the competitors. And the aim was to enjoy programming a bit and look to methods how to optimize the code. And compete with the others.


= Winners: =
= Winners: =
Line 15: Line 18:
The input data were quite large to emphasize the differences between algorithms and implementation changes.
The input data were quite large to emphasize the differences between algorithms and implementation changes.


Then we compare the power consumption of all implementations (there were a penalisation for not implementing a task 20ms).
Then we compare the power consumption of all implementations (there were a penalisation for not implementing a task 20s).


The differences between the disc accesses were not to high in our case so we omit to measure them.
The differences between the disc accesses were not to high in our case so we omit to measure them.
Line 54: Line 57:
$ ./third text.txt
$ ./third text.txt
2
2
</pre>  
</pre>
 


== Solution: ==
= Solution: =
1/ Because the test input number was high there were only slight differences between bash and c implementation.
1/ Because the test input number was high there were only slight differences between bash and c implementation.
The winning solution use formula for the computing of the sum ([[File:02first.c]]).
The winning solution use formula for the computing of the sum ([[File:02first.c]]).
Line 66: Line 68:


3/ In c problem could use several methods either go through whole file and the fastest was using fscanf ([[File:03third.c]]), better version use some mechanism to find the occurrence of f character at first and then test whether it is part of file word (or some similar mechanism) ([[File:01third.c]]). The best solution uses mmap and order the tests based on the percent occurrence of character in text ([[File:02third.c]]).  
3/ In c problem could use several methods either go through whole file and the fastest was using fscanf ([[File:03third.c]]), better version use some mechanism to find the occurrence of f character at first and then test whether it is part of file word (or some similar mechanism) ([[File:01third.c]]). The best solution uses mmap and order the tests based on the percent occurrence of character in text ([[File:02third.c]]).  
In bash the winning strategy parse the output using grep command, and then  use tr to find the proper number sed could be used too but it was slower again.
Here could be used Boyer-Moore algorithm - but it is problem to have it in time (nobody use it), and the string is too short so the results are not so good.


In all cases there is fine to use come optimisation flags.
Boyer-Moore algorithm would be also good for this task- but it is too complicated to write in time (nobody used it), Also the word was too short, so the results wouldn't be fast enough in comparison to bash.
 
In bash the winning strategy is to split the input into one-word-per-line form using the tr command, and count the matching lines using grep -c.  Sed could be used too but it was more complex to write and slower.
 
 
In all cases there is fine to use come optimization flags.
There are another methods how to speed up the solution (use fgetc_unlock and so on), nobody use it this competition (but feel free to try to test it :) ).
There are another methods how to speed up the solution (use fgetc_unlock and so on), nobody use it this competition (but feel free to try to test it :) ).

Latest revision as of 21:34, 17 September 2016

Watt hunting competition

This competition was part of DevCon09 event. It takes place in lab - where was 10 computers setting up for the competitors. And the aim was to enjoy programming a bit and look to methods how to optimize the code. And compete with the others.

Winners:

1. Dan Mach
2. Pavel Moravec
3. Tomas Janousek

(from 24 competitors)

Problem description:

We have three tasks - which have to be implement in 45 minutes.

Then we measure only the time of the run of program on test data set (we want to measure disk accesses too, but scomes tool was not stable enough and that there were only small differences in disk accesses in the solution we use for tests in comparison with the time differences). The input data were quite large to emphasize the differences between algorithms and implementation changes.

Then we compare the power consumption of all implementations (there were a penalisation for not implementing a task 20s).

The differences between the disc accesses were not to high in our case so we omit to measure them.


Tasks are:

1/ Implement a program which input is a positive number less then 10 000 (we will denote the number n) and the output is 1+2+3+...+(n-1) The return code have to be 0.

Example:

 	$ ./first 5
	10

2/ Implement a program, which input is a file name and this program put the file on output - but omit all white spaces (' ', '\t', '\n'). The return code have to be 0.

Example:

 	$ cat jmeno.txt
	. . . .
	..
	Ahoj svete
	... ...
	$ ./second jmeno.txt
	......Ahojsvete......

3/ Implement a program which input is file name and which return the number of occurrences of string "file" in input file. The string have to be separated with white space (' ', '\t', '\n') on the beginning and at the end. The return code have to be 0.

Example:

 	$ cat text.txt
	123 file files hula
	file pes
	pfile les
	$ ./third text.txt
	2

Solution:

1/ Because the test input number was high there were only slight differences between bash and c implementation. The winning solution use formula for the computing of the sum (File:02first.c).

2/ In c there is several methods to implement this tasks the slower was fscanf (File:02second.c), faster variants use getc (File:01second.c) and the fastest one uses fread (File:03second.c). Bash version which uses sed parser is rather slow, but the one using tr command is really fast.

3/ In c problem could use several methods either go through whole file and the fastest was using fscanf (File:03third.c), better version use some mechanism to find the occurrence of f character at first and then test whether it is part of file word (or some similar mechanism) (File:01third.c). The best solution uses mmap and order the tests based on the percent occurrence of character in text (File:02third.c).

Boyer-Moore algorithm would be also good for this task- but it is too complicated to write in time (nobody used it), Also the word was too short, so the results wouldn't be fast enough in comparison to bash.

In bash the winning strategy is to split the input into one-word-per-line form using the tr command, and count the matching lines using grep -c. Sed could be used too but it was more complex to write and slower.


In all cases there is fine to use come optimization flags. There are another methods how to speed up the solution (use fgetc_unlock and so on), nobody use it this competition (but feel free to try to test it :) ).