Team Reports 2015
Team leaders are encouraged to present a report for their team (in alphabetical order).
In FY 2014/2015 the audit team has undertaken the following actions:
- An Audit Plan for 2015 has been created
- Four audits from the Audit Plan have been started: one was aborted, one has finished, and two are still running
- 4 incidents have been handled
- The Co-Audit Team got a new way to conduct audits and was renamed to RA-Audit
Incident Types in FY 2014 / 2015
Data Privacy Breach
To build a strong audit capability, further internal Auditors are welcome to participate in the Audit Team.
CAcert Lead Auditor
Critical System Administrator Team Report July 2014 - June 2015
After the many hardware changes made in the previous reporting year, no further changes were necessary in the past year. There were no failing components, and only one site visit was performed.
The log of visits to the hosting facility shows the following "on site" activities:
- [12.11.2014] signer maintenance and firewall OS upgrade
Actually, the firewall OS upgrade was done from outside the server room, and might as well have been performed from home, but since it was the first time we did this, we decided to limit the risks by doing it this way.
The total number of visits (1) was much lower than the number in previous years (7 in 2013/2014, and also 7 in 2012/2013), which shows the relative stability of the critical systems.
The equipment is getting older and older though, so we shouldn't be surprised to see more failures / visits in the future due to hardware problems.
All other (i.e. most!) system administration work has been performed remotely. Issues directly affecting the operation of the webdb server continue to be logged to the firstname.lastname@example.org mailing list (archived at https://lists.cacert.org/wws/arc/cacert-systemlog ) with headings like "configuration change webdb server", "security upgrades webdb server" or "cvs.cacert.org checkin notification". This logging is also used for changes to all other services like DNS, OCSP etc. under critical-admin management. A total of 187 messages were posted on this mailing list during the year.
While the main server environment was upgraded from Debian "Squeeze" to "Wheezy" in the previous reporting year, the upgrade of the chroot environment in which the CAcert application code runs to the same OS level was delayed until August 28, 2014. This was due to some issues with the CAcert application code which had to be resolved by the Software Assessment team first.
In the meantime, it is time again for another OS upgrade, as Debian "Wheezy" has gone to "oldstable" status, while "Jessie" is the current Debian stable release. We expect to be able to do this upgrade later this year.
Other maintenance work on the webdb server during the reporting period involved:
- 54 installations of one or more Debian security updates
- 4 configuration changes
- 30 application software patch installations
thus making a total of 88 critical admin interventions for this server (previous year: 97).
The DNS service has been continued in more or less the same configuration as the previous year. One noteworthy change was the addition of a TLSA record for www.cacert.org and secure.cacert.org. This supports effective use of the DNSSEC/TLSA Validator browser plugin available from CZ.NIC Labs.
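For illustration, the data part of a TLSA record can be derived from a certificate with a few lines of Python. This is only a minimal sketch: the report does not say which certificate usage, selector and matching type were chosen for www.cacert.org and secure.cacert.org, so the values 3 0 1 (end-entity certificate, full certificate, SHA-256), the port 443 owner name, and the helper name tlsa_rdata are assumptions. The input bytes are a placeholder, not a real certificate.

```python
import hashlib


def tlsa_rdata(cert_der: bytes, usage: int = 3, selector: int = 0, mtype: int = 1) -> str:
    """Return TLSA record data ("usage selector mtype hex") for a DER certificate.

    Hypothetical helper: only selector 0 (full certificate) is handled.
    Matching type 1 is SHA-256 and 2 is SHA-512, as defined in RFC 6698.
    """
    if selector != 0:
        raise ValueError("this sketch only supports selector 0 (full certificate)")
    if mtype == 1:
        digest = hashlib.sha256(cert_der).hexdigest()
    elif mtype == 2:
        digest = hashlib.sha512(cert_der).hexdigest()
    else:
        raise ValueError("matching type must be 1 (SHA-256) or 2 (SHA-512)")
    return f"{usage} {selector} {mtype} {digest}"


# Placeholder bytes standing in for a real DER-encoded certificate (assumption).
placeholder_der = b"not a real certificate"
print("_443._tcp.www.cacert.org. IN TLSA", tlsa_rdata(placeholder_der))
```

A validating client compares the digest published in DNS (and protected by DNSSEC) with the digest of the certificate presented by the server.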
Maintenance activities for this server boiled down to:
- 2 DNS software version updates
- 2 configuration changes
- 2 installations of one or more OpenSuSE security updates
- 1 Key Signing Key rollover (for each of 3 zones)
- 12 zone file changes
thus making a total of 19 critical admin interventions for this server (previous year: 46).
The OCSP service has been continued in more or less the same configuration as the previous year.
Maintenance activities for the OCSP service boiled down to:
- 1 reboot after system failure
- 3 configuration changes
- 3 installations of one or more OpenSuSE security updates
thus making a total of 6 critical admin interventions for this server (previous year: 7).
The CRL service has been continued in more or less the same fashion as the previous year.
Maintenance activities for the CRL service boiled down to:
- 1 configuration change
- 3 installations of one or more OpenSuSE security updates
thus making a total of 4 critical admin interventions for this server.
Last year we reported that the availability of the CRL service had been stabilized during that year. While this is still true, we want to repeat the driving negative factors here:
- the CRLs are growing larger and larger as more certificates are revoked, but all revocations are kept on the CRLs, including those for certificates which have expired;
- the number of consumers for these CRLs is increasing, in particular a number of consumers which attempt to retrieve the CRLs at a much higher frequency than is really sensible (once per week should be OK for most purposes).
Note that we are routinely pushing out over 150 GB of data *per day* from just this server. This is an artificial limit, imposed by the bandwidth that has been donated to CAcert by BIT. Without this bandwidth limit, the amount of data pushed out would grow even larger. While our server can handle this, CAcert does not have the finances to pay for such an extravagant usage of bandwidth.
The most fundamental method of attacking the problem is still open though, as it entails some fundamental changes in the operation of the CAcert signing server:
- reduce the size of the CRLs by excluding expired certificates.
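The idea of excluding expired certificates can be sketched as a simple filter over the revocation list. This is an illustrative sketch only, not the signing server's actual code: the RevokedEntry record and the prune_expired function are hypothetical names, and it assumes the signer can look up each revoked certificate's expiry date in its own database, since a CRL entry itself carries only the serial number and revocation date.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RevokedEntry:
    """Hypothetical record for one revoked certificate."""
    serial: int
    revoked_at: datetime
    cert_not_after: datetime  # expiry date of the revoked certificate


def prune_expired(entries, now):
    """Keep only revocations whose certificate has not yet expired at `now`."""
    return [e for e in entries if e.cert_not_after > now]


utc = timezone.utc
now = datetime(2015, 7, 1, tzinfo=utc)
entries = [
    # Revoked in 2012, certificate expired in 2014: no longer needed on the CRL.
    RevokedEntry(1, datetime(2012, 3, 1, tzinfo=utc), datetime(2014, 3, 1, tzinfo=utc)),
    # Revoked in 2015, certificate valid until 2016: must stay on the CRL.
    RevokedEntry(2, datetime(2015, 1, 10, tzinfo=utc), datetime(2016, 1, 10, tzinfo=utc)),
]
pruned = prune_expired(entries, now)
print(len(entries), "entries before pruning,", len(pruned), "after")
```

An expired certificate is rejected by relying parties on its validity period alone, which is why dropping it from the CRL does not weaken revocation checking.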
The boxbackup server has also been continued unchanged, with maintenance activities consisting of a number of smaller interventions:
- 53 installations of one or more Debian security updates
thus making a total of 53 critical admin interventions for this server (previous year: 22).
The new dual firewall has performed flawlessly during the reporting year. One major software upgrade was performed in November 2014, by upgrading the OS on both firewalls to OpenBSD 5.6. Thanks to the dual firewall design, this upgrade could be performed with zero downtime for CAcert users and services.
Maintenance of the firewall has boiled down to three key components:
- the pf ruleset - 7 changes
- the relayd configuration (for OCSP and CRL) - 4 changes
- the unbound configuration (for internal DNS service) - 6 changes
thus making a total of 17 critical admin interventions for the new firewall, besides the OS upgrade.
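As a cross-check, the per-service totals reported above can be tallied with a short script. The dictionary keys are informal labels chosen for this sketch. The sum comes to 187, the same number as the messages posted to the system log mailing list during the year, which suggests (though the report does not state it) that each counted intervention corresponds to one logged message.

```python
# Per-service intervention counts as itemized in this report.
interventions = {
    "webdb": 54 + 4 + 30,          # security updates, config changes, patches
    "dns": 2 + 2 + 2 + 1 + 12,     # software, config, security, KSK rollover, zone files
    "ocsp": 6,                     # stated total (the three listed items add up to 7)
    "crl": 1 + 3,                  # config change, security updates
    "boxbackup": 53,               # security updates
    "firewall": 7 + 4 + 6,         # pf, relayd, unbound (OS upgrade not included)
}

total = sum(interventions.values())
for service, count in interventions.items():
    print(f"{service:10s} {count:4d}")
print(f"{'total':10s} {total:4d}")
```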
Our primary external monitoring remained based on the use of a private server of a critical team member, with the limitations implied by that. A new offer made for a well-connected VM has not materialized yet.
Some support has been given to the infrastructure team for enabling IPv6 support.
Software Assessment Team support
We continued to support the Software Assessment Team by maintaining a test server (on a virtual machine) which resembles the production webdb server as closely as possible. A second similar test server is also maintained for special critical system tests and preparation of major software upgrades. This second test server is normally the first one to be upgraded when new OS upgrades or security patches are to be installed.
The patch process developed by the Software Assessment Team has resulted again in a significant number (30) of successful patch updates to the production server (previous year: 58).
Events team support
From time to time the events team wants to inform CAcert members about important events like Assurer Training Events and the like. These mailings are performed by adding a custom script to the webdb server and running it against the current database. Based on arbitration http://wiki.cacert.org/Arbitrations/a20090525.1, such scripts are prepared by the events team and handed over to the critical admin team for installation and execution. 5 cases were handled in the past year.
One huge mailing was also executed by the critical admin team, for informing the CAcert membership about the CCA policy changes:
The mailing script has been running from Sep 24, 09:45 until Sep 25, 06:07 CEST. A total of 237187 messages have been sent out for 290146 entries.
According to the postfix mail statistics, a total of 239689 e-mails were sent during this period (including regular webdb service mails). Delivery problems were reported for 27642 of these e-mails.
At this moment (Sep 25, 09:15 CEST) there are still some 4400 e-mails queued for possible delivery later (the regular queue size is more like 50 - 100 e-mails).
Some smaller mailings were performed as well, concerning organizational assurers and an incorrect CCA that was displayed on the website for some time.
Interaction with other teams
From time to time the critical admin team also receives requests from other CAcert teams like Support and Arbitration, which we try to handle as quickly as possible. The total number of e-mails processed or generated by the critical admin team during the reporting year amounts to around 1000.
There were no team changes in the past year.
Plans for the coming year include:
- prepare system software upgrades (Debian Jessie, OpenSuSE 13.2)
- move critical services from sun4 to sun3, possibly retain sun4 for redundancy
- improve availability of OCSP and CRL services
- implement performance monitoring for the new firewall
- improve external system monitoring
- expand and improve server documentation
- look for ways to strengthen the sysadmin team
Wytze van der Raay, Mendel Mobach, Martin Simons
Critical System Administrator Team
Management of CATS and the Assurer Challenge
The past year also saw a very stable and essentially unchanged production CATS. At least the French language for the user interface has now been activated, though there's still some discussion going on about improvements in the corresponding bugtracker case.
On the development system the user interface and the Assurer Challenge have been translated (at least to a good percentage) into Czech. The user interface translation still has to be activated (one of the things near the top of my ToDo list), and both the user interface and the Assurer Challenge need reviews by others before they can be transferred to the production system.
Translating the Assurer Challenge into Czech required some changes in the software, since Czech uses quite a few characters not covered by the ISO-8859-1 character set, which is currently used by CATS. The quick solution was to use HTML-encoded Unicode to store the questionnaire data. This was easy to implement and should allow comparatively easy migration once software and database are ready to handle Unicode natively.
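The storage trick described above can be illustrated with a short Python sketch. This is not CATS code; the function names are hypothetical, and it merely assumes that characters outside ISO-8859-1 are stored as numeric HTML character references while everything else is stored as plain ISO-8859-1 bytes.

```python
import html


def to_latin1_storage(text: str) -> bytes:
    """Encode text as ISO-8859-1; unsupported characters become &#NNN; references."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


def from_latin1_storage(data: bytes) -> str:
    """Decode stored bytes and turn HTML character references back into Unicode."""
    return html.unescape(data.decode("iso-8859-1"))


question = "Příliš žluťoučký kůň"  # Czech sample text
stored = to_latin1_storage(question)
print(stored)                                   # bytes that fit an ISO-8859-1 column
print(from_latin1_storage(stored) == question)  # round trip restores the text
```

Because a browser renders the references directly, the stored data can be sent to an HTML page unchanged. The reverse step shown here is the kind of conversion a later migration to native Unicode would need; note that it would also decode any literal character references that were already present in the original text.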
Since early summer this year, issuing of CATS certificates has ground to a halt, because the very limited manpower in support is currently needed for more important jobs than verifying the Assurer status of the applicants. An interface to the main database that allows verification of Assurer status is (AFAIK) still not implemented.
Several new tests, variants of a "Data Privacy Quiz", have been created on the testserver, but still contain only a handful of questions.
The usual statistics, from July 2014 to June 2015:
- 2349 tests have been made: 1196 English Assurer Challenges, 1053 German ones and 140 Triage Challenges
- 1122 of the Assurer Challenges had at least 80% correct answers and are therefore counted as passed
- 793 different users (that is, different certificates used to login) have passed the test at least once
- 263 users tried the test at least once but don't have a successful test recorded
- On average, those who passed the test had about one (more exactly: 0.93, compared to last year's 0.91) unsuccessful try before passing.
So, the number of tests (started and passed) dropped by about 30% compared to last year, but is still above the level of 2013. The main reason is probably the lower number of new Assurers, which can be seen in the CAcert statistics.
The number of users who tried at least once but never passed the Challenge has increased (in absolute numbers!), but is still small compared to the number of users who passed. This needs some monitoring, but I don't consider it alarming yet.
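To put the two user counts from the statistics above in proportion, the share of users without a successful test can be computed as follows. The sketch assumes that the two groups are disjoint and together cover all users who took a test in the reporting period, which follows from how they are defined above.

```python
passed_users = 793        # users with at least one passed test
never_passed_users = 263  # users who tried but have no successful test recorded

total_users = passed_users + never_passed_users
share_never_passed = never_passed_users / total_users

print(f"{total_users} users, {share_never_passed:.1%} never passed")
# prints "1056 users, 24.9% never passed"
```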
Plans for next year
- Extend CATS to better support reviewing of questionnaires (translated and new answers)
- Get some of the new questions on the testsystem reviewed and installed on the production system
- The usual things, new materials for ATEs, support event organisation, find some new people doing work...
Early this year the Infrastructure Team lead changed from Mario Lipinski to Jan Dittberner; the Board approved this change in January. Thanks to Mario for all his work in the past.
Jan did a major rework of the documentation in the Wiki in February and wrote an announcement to the system administrators lists about this work in progress. During the work on this task it became obvious that many systems are in an outdated state and lacking proper documentation. This has not changed very much yet.
Unfortunately, Jan did not find the time to continue the tasks announced in February, namely replacing the old email and webmail containers, sending an infrastructure team census and enabling DKIM/DMARC.
The team would need more people with enough time to maintain all systems in a professional way.
New Root & Escrow Project (NRE)
Organisation Assurance Team
For your information:
During the second part of 2014 all policies were set to policy status. During the remaining part of the business year, multiple policies were discussed, including PoP changes, a privacy web policy, appeals and relations of officer roles. None of them reached a state where a policy was changed.
For your information:
Alexander Bahlo, Officer for Public Relations, was recalled by the board on January 11th, 2015 with motion m20150111.3. Unfortunately, the board did not give any reason for this decision. There was also no discussion between the board and Alexander before this motion, so the decision came very suddenly, especially because the motion was only communicated on January 30th, 2015 at 21:19 CET. That weekend FOSDEM took place in Brussels, where I (Alexander) participated, so I could not receive this information before the end of the event. Although other board members and involved people were attending as well, nobody talked to me (Alexander) about this motion. While this is, of course, entirely legal, it is not good behaviour.
Software Development Team