From e4ac668478cdcf916404dce0b4802c2ebfba690b Mon Sep 17 00:00:00 2001
From: Gordon McCann <43148247+gwm17@users.noreply.github.com>
Date: Sat, 1 Oct 2022 22:53:32 -0400
Subject: [PATCH] Updated Home (markdown)

---
 Home.md | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)
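
The time-ordering and slow-sorting stages described in the updated Home.md text can be sketched as follows. This is a minimal Python illustration, not the event builder's C++ implementation: the `Hit` tuple layout, the function names `merge_time_ordered` and `slow_sort`, and the choice to measure the slow window from the first hit of each event are all assumptions made for the example.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical hit layout for illustration: (timestamp, channel id).
Hit = Tuple[int, int]


def merge_time_ordered(channels: Iterable[Iterable[Hit]]) -> Iterator[Hit]:
    """Merge per-channel hit streams that are each already sorted in time."""
    # heapq.merge compares the current head of every stream and always
    # yields the earliest one, then advances only that stream.
    return heapq.merge(*channels, key=lambda hit: hit[0])


def slow_sort(hits: Iterable[Hit], slow_window: int) -> Iterator[List[Hit]]:
    """Group time-ordered hits into built events without a master trigger."""
    event: List[Hit] = []
    for hit in hits:
        # Assumption: the window opens at the first hit of the event.
        # A hit outside the window flushes the event and starts a new one.
        if event and hit[0] - event[0][0] > slow_window:
            yield event
            event = []
        event.append(hit)  # no hit is ever discarded at this stage
    if event:
        yield event


if __name__ == "__main__":
    scint = [(100, 0), (5000, 0)]
    anode = [(150, 1), (5100, 1)]
    delay = [(1200, 2)]
    for built in slow_sort(merge_time_ordered([scint, anode, delay]), 1000):
        print(built)
```

Any channel can open an event here, which mirrors the "no master trigger" behavior in the text; a real implementation would also apply the per-channel time shifts before merging.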

diff --git a/Home.md b/Home.md
index e08ec04..97dcd14 100644
--- a/Home.md
+++ b/Home.md
@@ -12,19 +12,19 @@ The pipeline for event building is as follows: time-shifting and time-ordering,
 
 ## Time-Shifting and Time-Ordering
 
-To minimize dead time, as was previously mentioned, it is in general beneficial to minimize the width of coincidence windows. To this end, the event builder allows the user to apply shifts to the timestamps of hits on a channel by channel (or board by board) basis. In this way, a detector can be shifted so that the timestamps of two detectors can be synchronized, thus meaning that the coincidence window for the detectors needs only be wide enough to capture the timing distribution rather than the time offset as well as the distribution between two detectors. Typically, in the SESPS-SABRE setup, shifts are applied to anode data and SABRE data such that these events are synchronized with scintillator data, thus making the delay-line data the only significant timing event in the slow coincidence window.
+To minimize dead time, it is generally beneficial to minimize the width of coincidence windows. To this end, the event builder allows the user to apply shifts to the timestamps of hits on a channel-by-channel (or board-by-board) basis. In this way, the timestamps of two detectors can be synchronized, so the coincidence window for those detectors need only be wide enough to capture their relative timing distribution, rather than both the distribution and the time offset between them. Typically, in the SESPS-SABRE setup, shifts are applied to anode data and SABRE data such that these events are synchronized with scintillator data, thus making the delay-line data the only significant timing event in the slow coincidence window.
 
-Data from the CoMPASS DAQ system is typically given in a raw binary format for each individual digitizer channel (in the full SESPS-SABRE setup this results in around 145 files). These files are by definition time-ordered, however for event building purposes we need to combine these files in a time-ordered way so that coincidence analysis can be performed. 
+Data from the CoMPASS DAQ system is typically given in a raw binary format for each individual digitizer channel (in the full SESPS-SABRE setup this results in around 145 files). Each individual file (detector channel) is sorted in time, but the files must be combined into a single, complete time-ordered data set. This is done by taking the earliest (top) hit of each file and comparing their timestamps. The earliest hit across all files is then sent to the next stage of building. That file then advances to its next hit, and the comparison repeats until every file is exhausted.
 
 ## Slow Sorting and Fast Sorting
 
-Once the data is properly ordered in time, the data must be sorted in to built events. The most general attempt at this is referred to as slow sorting by this code. Slow sorting is where all data that falls within a coincidence window is taken and placed into a single built event. This coincidence window is often referred to as the slow window. "Slow" comes from the fact that this is the largest window used by the program, so this sorting takes place over the largest time-scales, and is therefore slow. There are a few important things to note about slow sorting in the program. Foremost is that it does not have a master trigger. That is, data from any detector channel can start an event. This is essentially a requirement for using the time-shifts outlined above, as well as optimizing the slow window size. The window stays open until a hit with a timestamp that occurs outside of the slow window is found. Then, that hit starts the new built event and the previous event is flushed out to the next stage of the pipeline. Also, the slow sort algorithm does _not_ discard any data. The built event from slow sort is comprised of dynamically allocated arrays (read std::vector) thus meaning that in principle the slow sort incurs no intrinsic dead time other than from fragmentation of events. 
+Once the data is properly ordered in time, the data must be sorted into built events. The most general attempt at this is referred to as slow sorting by this code. Slow sorting is where all data that falls within a coincidence window is taken and placed into a single built event. This coincidence window is often referred to as the slow window. "Slow" comes from the fact that this is the largest window used by the program, so this sorting takes place over the largest time scales. There are a few important things to note about slow sorting in the program. Foremost is that it does not have a master trigger. That is, data from any detector channel can start an event. This is essentially a requirement for using the time-shifts outlined above, as well as optimizing the slow window size. The window stays open until a hit with a timestamp that occurs outside of the slow window is found. Then, that hit starts the new built event and the previous event is flushed out to the next stage of the pipeline. Also, the slow sort algorithm does _not_ discard any data. The built event from slow sort is composed of dynamically allocated arrays (read: `std::vector`), so in principle the slow sort incurs no intrinsic dead time other than from fragmentation of events.
 
 Fast sorting is an optional secondary stage of coincidence analysis, aimed at resolving multi-hit events. In general, each coincidence event should contain at most a single hit from a given detector channel (there are cases where this is not true, however they are rare for the SESPS-SABRE setup), but if the slow window is significantly wider than the typical time correlation for two hits there is a possibility that two hits for a given channel may be put into a single  built event. To resolve this hit degeneracy, the user may input additional time correlation information for specific channels. Fast refers to the fact that these windows must be shorter than the slow window. Specifically, for the SESPS-SABRE code, the fast sort provides the option to enforce a window on focal plane scintillator-anode data and then on focal plane scintillator-SABRE data, as the scintillator and SABRE tend to be much faster detectors than the focal plane ion chamber. These windows are referred to as the fast ion-chamber window and fast SABRE window respectively. It is important to note that fast sorting is _optional_ and may need tweaking on an experiment by experiment basis, and is not recommended to be run until the time-shifts and the slow sorting method have been tested and run successfully. Additionally, it should be emphasized that the default fast sorting stage _can_ dump data. It requires that an ion-chamber anode hit be present in order to be saved; in general this means that scintillator-only events (scintillator singles) or SABRE singles will be dumped if the fast sorting is done. 
 
 ## Basic Analysis
 
-Technically speaking, after the sorting stages, the event building is complete and the next, much more complicated and experiment specific data analysis stage should take over. However, due to some limitations of the CoMPASS software with online data analysis, as well as to provide a method to test the success of event builder, the event builder can pass the data on to a very  basic analysis class. This analysis is _not_ meant to be used as a final analysis program; it does not have very many safety measures and is typically too simple and too difficult to modify for most experiments. There are some key features that outline both the use and drawbacks of this analysis. First is that _any_ degeneracy in data must be resolved for use with the analysis. Consider the following scenario: the front left delay line signal is slightly noisy and inside the built event there are two front left delay line hits. To calculate a calibrated focal plane position one must subtract the timestamp of the left and right signal for a given delay line. How then is the analysis to select which front left delay signal goes with the single front right delay signal? To make the code generally applicable it employs a very simple solution of taking whichever hit occurred first, but one can imagine all  of the reasons why this is not desirable for specific experiment cases. This first-in selection scheme is employed for _every_ detection channel that gets converted into analyzed data. In general, the only data member that continues to maintain the earlier policy of not dumping data is the SABRE array, however, a downscaled version of the SABRE array is then required to be used with the online plotting. Additionally, due to the dynamic nature of the sorted data, checks must be made upon the validity of the data at analysis time. This in turn induces a performance penalty as more and more complicated analyses are performed, which reduces the usefulness of adding more analysis steps. 
Finally, the data analysis tends to bloat the file size. Each additional analyzed parameter induces an increase in the written data size, and ROOT does not support optional writing. That is: even if an event does not have a right scintillator hit, all of the right scintillator data member will still be written to the file with an illegal default value (typically something like -1). In a more specialized analysis, only relevant data would be written, but due to the general nature of this analysis along with its focus on providing sanity checks for the event building process, a lot of experimentally irrelevant data will be written. 
+Technically speaking, after the sorting stages, the event building is complete and the next, much more complicated and experiment-specific data analysis stage should take over. However, due to some limitations of the CoMPASS software with online data analysis, as well as to provide a method to test the success of the event builder, the event builder can pass the data on to a very basic analysis class. This analysis is _not_ meant to be used as a final analysis program; it does not have very many safety measures and is typically too simple and too difficult to modify for most experiments. There are some key features that outline both the use and drawbacks of this analysis. First is that _any_ degeneracy in data must be resolved for use with the analysis. Consider the following scenario: the front left delay line signal is slightly noisy and inside the built event there are two front left delay line hits. To calculate a calibrated focal plane position one must subtract the timestamps of the left and right signals for a given delay line. How then is the analysis to select which front left delay signal goes with the single front right delay signal? To make the code generally applicable it employs a very simple solution of taking whichever hit occurred first, but there are many experiments where this is not the correct approach. This first-in selection scheme is employed for _every_ detection channel that gets converted into analyzed data. In general, the only data member that continues to maintain the earlier policy of not dumping data is the SABRE array; however, a down-scaled version of the SABRE array is then required for use with the online plotting. Additionally, due to the dynamic nature of the sorted data, checks must be made on the validity of the data at analysis time. This in turn induces a performance penalty as more and more complicated analyses are performed, which reduces the usefulness of adding more analysis steps. 
Finally, the data analysis tends to bloat the file size. Each additional analyzed parameter increases the written data size, and ROOT does not support optional writing. That is, even if an event does not have a right scintillator hit, every right scintillator data member will still be written to the file with an illegal default value (typically something like -1). In a more specialized analysis, only relevant data would be written, but due to the general nature of this analysis along with its focus on providing sanity checks for the event building process, a lot of experimentally irrelevant data will be written. 
 
 ## Other Operations
 
@@ -36,19 +36,22 @@ Sometimes there is data collected where the only thing of interest is the number
 
 # Installation, Building, and Setting up the Workspace
 
-The event builder uses the [premake](https://premake.github.io/) build system. To install premake, simply download the release from the webpage and install it to a location on your path (don't bother trying to build it from source).
+The event builder uses CMake as its build system and requires CMake version 3.16 or newer.
 
-The only external dependence for this repository is the ROOT Data Analysis Package. Due to the large size and complexity of ROOT, ROOT is not included as a submodule, and rather the user is relied upon to properly install and setup their own ROOT package. This code has been primarily tested and validated using ROOT6.14, so mileage may vary with older versions.
+The only external dependency for this repository is the ROOT Data Analysis Package. Due to the large size and complexity of ROOT, it is not included as a submodule; instead, the user is relied upon to properly install and set up their own ROOT package. Please note that the event builder uses C++17, and as such the ROOT installation used _must_ also have been built using C++17. This can be checked on macOS or Linux using the `root-config` tool (for example, the output of `root-config --cflags` includes the `-std` flag that ROOT was built with).
 
-Building the code is fairly straightforward. Obtain the repository from GitHub using the command `git clone --recursive https://github.com/sesps/SPS_SABRE_EventBuilder.git`. After obtaining the repository from GitHub, use premake to build the project files for your system. This will also involve setting the paths for your ROOT libraries in the `premake5.lua` file (see the README for details). In a typical Linux system the command `premake5 gmake2` will build makefiles to use with GNU Make. Then build the code using your chosen system (i.e. make, MSVC, Xcode). This will build two executables in the `bin` directory of the repository, one called `EventBuilder` and one called `EventBuilderGui`. The only difference between these two, is that the non-GUI version is a pure commandline application while the other has a GUI built in the ROOT environment.
+Building the code is fairly straightforward. Obtain the repository from GitHub using the command `git clone --recursive https://github.com/sesps/SPS_SABRE_EventBuilder.git`. After obtaining the repository from GitHub, use CMake to build the project. See the README for an example of running CMake.
 
-Also included in the `bin` directory is a bash script called `archivist`. This script is for use at the FSU online DAQ, and is mostly irrelevant in other use cases.
+## Workspaces
+The event builder needs to know where the CoMPASS binary data is stored and the relevant directories for writing event built data. The event builder can generate a workspace on its own. In the EventBuilderGui, simply select a workspace folder, and the event builder will generate all required directories in that workspace folder. (It can also generate the workspace folder itself if you enter a directory that does not already exist.)
 
-Finally, to run the code the user must setup the proper directory environment referred to as the workspace. An example of what the workspace environment should contain is shown in the `example` directory of the repository. `raw_binary` should contain raw binary archives (read: .tar.gz) of CoMPASS runs  that follow the format `run_#.tar.gz`. The code unpacks the archive to the `temp_binary` directory, reads the data, and saves an event built file to the proper directory based on the type of analysis the user requests. The `temp_binary` directory is then cleaned so that it can be used for the next run. Note that in general it is best to have the workspace be located somewhere other than the repository usually with a head directory name that indicates what experiment the included data is associated with.
+All CoMPASS binary data should then be moved to the `raw_binary` directory, stored as one `.tar.gz` archive per run containing all of the individual `.BIN` files made by CoMPASS for that run. Archive names should follow the format `run_#.tar.gz`.
+
+Also included in the `bin` directory is a bash script called `archivist`. This script will move data from CoMPASS projects to a workspace, tarring and zipping them appropriately.
 
 # Running the Code
 
-The basics of running the code are mostly contained within the input file, of which there is an example in the repository called `input.txt`. The input file asks for the location of a workspace, a channel map file which gives a list of the digitizer channels and the associated detector information, a board offset file which lists the time-shifts to be applied to specific channels,  a scaler file which lists any digitizer channels to be taken as scalers, and a cut list file which lists any cuts to be used with the plotter tool. Examples of these files may be found in the `etc` directory. The SESPS-SABRE event builder will also ask for reaction information so that a kinematic correction can be applied to analyzed focal plane data. This includes specifying atomic numbers for use in looking up nuclear masses. Note that the code uses the 2017 AMDC mass evaluation data; if you input information which requests a nuclear mass not included in that data, an error will occur in the code. Finally, the input file asks for window sizes as well as a range of run numbers over which to the program will be run.
+The basics of running the code are mostly contained within the input file. Input files are YAML files which contain the location of a workspace, a channel map file which gives a list of the digitizer channels and the associated detector information, a board offset file which lists the time-shifts to be applied to specific channels, a scaler file which lists any digitizer channels to be taken as scalers, and a cut list file which lists any cuts to be used with the plotter tool. Examples of these files may be found in the `etc` directory. The SESPS-SABRE event builder will also ask for reaction information so that a kinematic correction can be applied to analyzed focal plane data. This includes specifying atomic numbers for use in looking up nuclear masses. Note that the code uses the 2017 AMDC mass evaluation data; if you input information which requests a nuclear mass not included in that data, an error will occur in the code. Finally, the input file asks for window sizes as well as a range of run numbers over which the program will be run.
 
 If the commandline executable (`EventBuilder`) is being used, format is the following:
 * `./bin/EventBuilder <evb_operation> <your_input_file>`
@@ -60,6 +63,4 @@ If the gui executable (`EventBuilderGui`) is being used, format is simply:
 
 The input file can then be loaded using the `File->Load` menu, or the user can manually enter the input parameters to the GUI. The GUI also provides functionality for saving an input file for the currently set parameters using the `File->Save` menu. The event building operation is then selected using the drop down menu. 
 
-Note that both executables should be run from the top-level repository directory, _not_ from the `bin` directory. This is standard practice to define the path to specific external data files, namely the workspace and nuclear mass datafile. If the program is run from the `bin` directory the behavior of much of the program is undefined. 
-
-Finally, as a convenience, a data file from a 12C(3He, alpha) run is included in the `example` so new users can test out the functionality of the event builder. 
\ No newline at end of file
+Note that both executables should be run from the top-level repository directory, _not_ from the `bin` directory. Running from the top level is what allows the program to resolve the paths to specific external data files, namely the workspace and the nuclear mass data file.
\ No newline at end of file