IRC meeting summary for 2017-03-30

Overview


Notes / short topics

  • BlueMatt notes 2 out of 6 of last weeks blocked and review-needed PRs got merged, which can be better. Wumpus created a high-priority for review github project page for the PRs that are mentioned to have priority in the meetings.
  • There have been 11 merges tagged 0.14.1 and 3 open PRs. Once those are in 0.14.1 should be good to go.

Main topics

  • Slow unit tests
  • Dealing with abortnode / ConnectTip / DisconnectTip failures
  • High priority review

Slow unit tests

background

Bitcoin Core provides a makefile target check that runs the project’s unit tests. The project has over time written more and more integration tests executed through the RPC interface, which are run automatically by the Travis Continuous Integration (CI) server that tests every Bitcoin Core pull request and which can manually be run by executing, qa/pull-tester/rpc-tests.py

As previously discussed in the 2017-03-16 meeting, these tests as a whole might currently take too long.

meeting comments

Wumpus made an overview of the slowest unit tests. Some of them have already been worked on or have PRs to make them faster.

We could also introduce an -extended mode for the unit tests, which does extra thorough tests that shouldn’t run every time. The extended mode should be part of the release process (and ran by gitian) and/or once a day on master.

Jonasschnelli has a build server with a nice web UI which does gitian builds every day at https://bitcoin.jonasschnelli.ch/.

Jnewbery notes the Travis CI service is currently failing because we’ve set it to run the extended tests once per day, so we’re flushing out all the extended tests that have always failed on Travis. Once PR #10114 and #10072 are merged these daily runs should pass Travis.

meeting conclusion

  • For the standard make check have a guideline of maximum ~1 second per test case, have an extended mode for unit tests with more extensive tests.

Dealing with abortnode / ConnectTip / DisconnectTip failures

background

Sdaftuar has an open PR (#9208) which improves performance after a chain reorg(anization), where a node discovers a new longest valid chain which excludes a block previously thought as being the longest valid chain (which will be orphaned). Currently we try to add transactions from each orphaned block back to the mempool, even though it’s likely many of those transactions will reoccur in the newly discovered blocks. #9208 stores those transactions in a separate “disconnect pool” for later processing.

meeting comments

BlueMatt brought up some edgecases in cases when ConnectTip or DisconnectTip return false where we now assert() instead of AbortNode(). Some broader discussion ensues on when to use AbortNode() and when Abort()/assert() and what’s the best way to alert the user an error has occured. AbortNode() allows us to exit with a message to inform the user, so ideally only critical errors should result in Abort()/Assert().

meeting conclusion

  • Rename AbortNode() to ShutdownSoon() and make sure disk corruption uses something different.

High priority review

All the PRs tagged with 0.14.1 should have priority

Sipa adds he’d like to see #9792 (FastRandomContext improvements and switch to ChaCha20) get in at some point to further remove the dependency on OpenSSL.

Gmaxwell proposes to re-open his PR #9424 which changes the logging categories to boolean flags instead of strings. This would make the use of PR’s like #10123, which allows you to exclude certain components from the debug log, easier. Cfields adds he’d like to do something simular for net messages.

Comic relief

wumpus             if BlueMatt can make it work faster that's great, but don't silently kill the program on every error
gmaxwell           wumpus: how about every other error?

9:48  BlueMatt     so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
...
9:53  BlueMatt     <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
...
9:57  BlueMatt     ok, soooo, acks on:<BlueMatt> <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
9:58  jeremyrubin  BlueMatt: maybe if you paste it again
9:58  BlueMatt     jeremyrubin: ok, <BlueMatt> ok, soooo, acks on:<BlueMatt> <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?

jtimon             it seems it's time to abort the meeting
wumpus             #endmeeting
BlueMatt           wumpus: we need to change that to #abort()
gmaxwell           But I wanted to cleanly flush!

Participants

IRC nick Name/Nym
gmaxwell Gregory Maxwell
wumpus Wladimir van der Laan
jonasschnelli Jonas Schnelli
sipa Pieter Wuille
BlueMatt Matt Corallo
jtimon Jorge Timón
cfields Cory Fields
achow101 Andrew Chow
jeremyrubin Jeremy Rubin
sdaftuar Suhas Daftuar
MarcoFalke Marco Falke
jnewbery John Newbery
morcos Alex Morcos
instagibbs Gregory Sanders
Chris_Stewart_5 Chris Stewart

Disclaimer

This summary was compiled without input from any of the participants in the discussion, so any errors are the fault of the summary author and not the discussion participants.