Tijmen2 committed on
Commit
0717c91
1 Parent(s): a33c453

Update README.md

Files changed (1)
  1. README.md +124 -80
README.md CHANGED
@@ -1,7 +1,7 @@
1
  ---
 
2
  license: mit
3
  base_model: fblgit/UNA-TheBeagle-7b-v1
4
- inference: false
5
  model_name: cosmosage
6
  model_type: mistral
7
  prompt_template: '{optional system message}
@@ -15,10 +15,12 @@ pipeline_tag: question-answering
15
 
16
  Cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
17
 
18
- cosmosage_v1 is a fine tune of fblgit/UNA-TheBeagle-7b-v1 on various cosmology-related datasets including
19
- open-access textbooks and scientific publications. It is primarily intended for Q&A mode, where the model
20
- gives a single answer in response to a single question, though it performs decently as a chatbot, as well.
21
- See https://github.com/tijmen/cosmosage for more details.
 
 
22
 
23
  ## Usage
24
 
@@ -46,46 +48,50 @@ Feel free to experiment with different prompts and sampling parameters. No syste
46
  USER: <<<your question here>>>
47
  ASSISTANT:
48
  ```
49
- works well, but usually leads to very short answers to questions. If you want longer responses, you should
50
- ask for them in the system prompt.
51
 
52
- The following is an example prompt that leverages the model's instruction-following capabilities as
53
- well as two-shot prompting. I've been using it with temperature=0.4, min_p=0.1.
54
 
55
  ```
56
- You are an expert cosmology assistant, able to answer questions on the cosmic microwave background,
57
- galaxy formation, large scale structure, theoretical cosmology, inflation, big bang nucleosynthesis, cosmology
58
- instrumentation, and other related topics. Please assume the user is fluent in scientific terminology. Elaborate
59
- where possible to give a complete answer. If you do not know, say you do not know. You will get two examples,
60
- then be expected to answer one question to the best of your ability.
 
61
 
62
 
63
  ## EXAMPLE 1
64
  USER: What is the ISW effect?
65
- ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave background photons gain
66
- energy as they fall into potential wells and lose it again as they climb back out. The origin of the effect is
67
- gravitational redshift. For static potential wells, there is no net effect. However, if the potentials grow or
68
- decay, there is a net effect, known as the Integrated Sachs-Wolfe effect. In a matter-dominated universe, linear
69
- potentials stay constant. The late-time ISW effect refers to the fact that at late times, dark energy causes a
70
- deviation from matter domination and linear potentials decay. This causes an effect in the cosmic microwave
71
- background which has been measured in cross-correlation with large-scale structure. The early-time ISW refers
72
- to the fact that right after recombination, radiation still contributed a non-negligible amount of the energy
73
- density of the universe, causing potentials to grow. This causes the photons to gain energy. The early-time ISW
74
- has not been detected experimentally.
 
 
75
 
76
  ## EXAMPLE 2
77
  USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
78
- ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling city; as you
79
- get further out from city center, the suburbs never end. Eventually you have to stop counting and just say that
80
- from there on the people belong to a different town.
81
- In cluster masses, we integrate from the cluster center out to some radius rΔ where Δ might be 200, 500, or
82
- some other number. The definition of this radius is that the density of the enclosed mass is Δ times the
83
- average density of the universe at that redshift. The associated mass is MΔ.
84
- Note that 'average density' can be defined in two ways. You can either include the energy density of dark
85
- energy - in which case you are comparing to the critical density at that redshift - or you can use the matter
86
- density only. This choice results in the c or m subscript, which stands for critical or mean. Note that the
87
- critical density is higher than the mean density, so it gives an earlier cutoff in the integral. Therefore
88
- 'MΔc' masses are smaller than 'MΔm' ones.
 
89
 
90
  ## QUESTION
91
  USER: <<<INSERT YOUR QUESTION>>>
@@ -95,55 +101,93 @@ ASSISTANT:
95
  ## Example outputs
96
 
97
  ```
98
- USER: What is the current best measure of the optical depth to reionization?
99
- ASSISTANT: The current best measure of the optical depth to reionization is τ=0.054±0.007, as determined by the
100
- Planck Collaboration et al. (2018b). This value provides strong evidence for reionization at high redshift and
101
- helps constrain models of cosmic reionization.
102
-
103
- USER: In DfMux, the SQUID is nulled via Digital Active Nulling at MHz frequencies. How is the SQUID kept nulled
104
- at low frequency?
105
- ASSISTANT: In DfMux, the SQUID is kept nulled at low frequency by using a low-frequency flux-locked loop (FLL).
106
- The FLL is a feedback system that maintains the SQUID in its nulled state by adjusting the input current to
107
- cancel out the flux signal. This is necessary because the SQUID is sensitive to magnetic fields, and the
108
- magnetic field generated by the input current can change the bias flux, causing the SQUID to lose its null
109
- state. The FLL helps to maintain the null state by continuously adjusting the input current to counteract any
110
- changes in the bias flux.
111
 
112
  USER: Why does a cross-spectrum not suffer from noise bias?
113
- ASSISTANT: A cross-spectrum between two different tracers of the same underlying field does not suffer from
114
- noise bias because the noise in each tracer is uncorrelated with the noise in the other tracer. This is because
115
- the noise in each tracer is a random process that is independent of the other tracer. Therefore, the noise in
116
- one tracer does not affect the expected value of the cross-spectrum between the two tracers. This is in
117
- contrast to an auto-spectrum, which does suffer from noise bias because the noise in one tracer is correlated
118
- with the noise in the same tracer.
 
119
 
120
  USER: Summarize "Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD".
121
- ASSISTANT: This paper investigates the impact of beam far side-lobes on the estimation of the tensor-to-scalar
122
- ratio in the presence of foregrounds for the LiteBIRD mission. The authors find that the presence of foregrounds
123
- can significantly affect the estimation of the tensor-to-scalar ratio, and that the knowledge of the beam far
124
- side-lobes is crucial for accurate results. The study also discusses the importance of accurately modeling the
125
- beam profile and the challenges in estimating the tensor-to-scalar ratio in the presence of foregrounds.
126
-
127
- USER: Explain a detailed design for a future CMB satellite. What detector technology would you choose? What
128
- optics? Photometer vs spectrometer? etc.
129
- ASSISTANT: A detailed design for a future CMB satellite would involve several key components. Firstly, the
130
- choice of detector technology is crucial. One option could be using Transition Edge Sensor (TES) bolometers
131
- coupled to superconducting detectors, similar to those used in the Planck mission. These detectors are highly
132
- sensitive and can measure temperature fluctuations in the CMB with high precision. Another option could be
133
- using Kinetic Inductance Detectors (KIDs), which offer fast response time and are less susceptible to
134
- systematic effects.
135
-
136
- Next, the optics design should aim to minimize stray light and ensure high optical efficiency. This can be
137
- achieved by using a combination of reflective and refractive optics, along with a series of baffles and
138
- filters to block unwanted stray light.
139
-
140
- The choice between a photometer and a spectrometer depends on the science goals of the mission. A photometer
141
- would provide broadband measurements of the CMB temperature and polarization anisotropies, while a spectrometer
142
- would allow for more detailed studies of the spectral properties of the CMB.
143
-
144
- Finally, the spacecraft design should consider factors such as thermal control, power generation, attitude
145
- determination and control, communication, and data handling systems. It should also include redundant systems
146
- to ensure reliability and fault tolerance during the long duration of the mission.
147
  ```
148
 
149
  ## Qualitative evaluation
 
1
  ---
2
+ ---
3
  license: mit
4
  base_model: fblgit/UNA-TheBeagle-7b-v1
 
5
  model_name: cosmosage
6
  model_type: mistral
7
  prompt_template: '{optional system message}
 
15
 
16
  Cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
17
 
18
+ cosmosage_v1_gptq is an 8-bit quantized version of cosmosage_v1, which is a fine tune of
19
+ fblgit/UNA-TheBeagle-7b-v1 on various cosmology-related datasets including open-access textbooks
20
+ and scientific publications. It is primarily intended for Q&A mode, where the model gives a
21
+ single answer in response to a single question, though it performs decently as a chatbot, as well.
22
+
23
+ See https://github.com/tijmen/cosmosage for more details on the training process.
24
 
25
  ## Usage
26
 
 
48
  USER: <<<your question here>>>
49
  ASSISTANT:
50
  ```
51
+ works well, but usually leads to very short answers to questions. If you want longer responses,
52
+ you should ask for them in the system prompt.
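
As a minimal sketch of the advice above, the snippet below assembles a single-question prompt with an optional system message that asks for longer answers. The helper name `build_prompt` is hypothetical, and joining the parts with single newlines is an assumption based on the template and the example outputs in this README, not a documented requirement.

```python
def build_prompt(question: str, system_message: str = "") -> str:
    """Compose a prompt in the USER/ASSISTANT layout used by this README.

    The system message is optional; when it is empty, the prompt starts
    directly with the USER line.
    """
    lines = []
    if system_message.strip():
        lines.append(system_message.strip())
    lines.append(f"USER: {question.strip()}")
    # The model is expected to continue the text after "ASSISTANT:".
    lines.append("ASSISTANT:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Asking for long answers in the system message, as suggested above.
    print(
        build_prompt(
            "Why does a cross-spectrum not suffer from noise bias?",
            system_message="You are an expert in cosmology. Give long answers.",
        )
    )
```

The resulting string can be passed to whichever inference backend you use; the backend call itself is deliberately left out here.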
53
 
54
+ The following is an example prompt that leverages the model's instruction-following
55
+ capabilities as well as two-shot prompting. I've been using it with temperature=0.4, min_p=0.1.
56
 
57
  ```
58
+ You are an expert cosmology assistant, able to answer questions on the cosmic microwave
59
+ background, galaxy formation, large scale structure, theoretical cosmology, inflation,
60
+ big bang nucleosynthesis, cosmology instrumentation, and other related topics. Please assume the
61
+ user is fluent in scientific terminology. Elaborate where possible to give a complete answer.
62
+ If you do not know, say you do not know. You will get two examples, then be expected to answer
63
+ one question to the best of your ability.
64
 
65
 
66
  ## EXAMPLE 1
67
  USER: What is the ISW effect?
68
+ ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave
69
+ background photons gain energy as they fall into potential wells and lose it again as they
70
+ climb back out. The origin of the effect is gravitational redshift. For static potential wells,
71
+ there is no net effect. However, if the potentials grow or decay, there is a net effect, known
72
+ as the Integrated Sachs-Wolfe effect. In a matter-dominated universe, linear potentials stay
73
+ constant. The late-time ISW effect refers to the fact that at late times, dark energy causes
74
+ a deviation from matter domination and linear potentials decay. This causes an effect in the
75
+ cosmic microwave background which has been measured in cross-correlation with large-scale
76
+ structure. The early-time ISW refers to the fact that right after recombination, radiation still
77
+ contributed a non-negligible amount of the energy density of the universe, causing potentials
78
+ to grow. This causes the photons to gain energy. The early-time ISW has not been detected
79
+ experimentally.
80
 
81
  ## EXAMPLE 2
82
  USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
83
+ ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling
84
+ city; as you get further out from city center, the suburbs never end. Eventually you have to
85
+ stop counting and just say that from there on the people belong to a different town.
86
+
87
+ In cluster masses, we integrate from the cluster center out to some radius rΔ where Δ might be 200,
88
+ 500, or some other number. The definition of this radius is that the density of the enclosed mass
89
+ is Δ times the average density of the universe at that redshift. The associated mass is MΔ.
90
+ Note that 'average density' can be defined in two ways. You can either include the energy
91
+ density of dark energy - in which case you are comparing to the critical density at that
92
+ redshift - or you can use the matter density only. This choice results in the c or m subscript,
93
+ which stands for critical or mean. Note that the critical density is higher than the mean density,
94
+ so it gives an earlier cutoff in the integral. Therefore 'MΔc' masses are smaller than 'MΔm' ones.
95
 
96
  ## QUESTION
97
  USER: <<<INSERT YOUR QUESTION>>>
 
101
  ## Example outputs
102
 
103
  ```
104
+ USER: In DfMux, the SQUID is nulled via Digital Active Nulling at MHz frequencies. How is the
105
+ SQUID kept nulled at low frequency?
106
+ ASSISTANT: In DfMux, the SQUID is kept nulled at low frequency by using a low-frequency
107
+ flux-locked loop (FLL). The FLL is a feedback system that maintains the SQUID in its nulled state by
108
+ adjusting the input current to cancel out the flux signal. This is necessary because the SQUID
109
+ is sensitive to magnetic fields, and the magnetic field generated by the input current can change
110
+ the bias flux, causing the SQUID to lose its null state. The FLL helps to maintain the null state
111
+ by continuously adjusting the input current to counteract any changes in the bias flux.
112
 
113
  USER: Why does a cross-spectrum not suffer from noise bias?
114
+ ASSISTANT: A cross-spectrum between two different tracers of the same underlying field does
115
+ not suffer from noise bias because the noise in each tracer is uncorrelated with the noise in
116
+ the other tracer. This is because the noise in each tracer is a random process that is independent
117
+ of the other tracer. Therefore, the noise in one tracer does not affect the expected value of
118
+ the cross-spectrum between the two tracers. This is in contrast to an auto-spectrum, which does
119
+ suffer from noise bias because the noise in one tracer is correlated with the noise in the same
120
+ tracer.
121
 
122
  USER: Summarize "Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD".
123
+ ASSISTANT: This paper investigates the impact of beam far side-lobes on the estimation of the
124
+ tensor-to-scalar ratio in the presence of foregrounds for the LiteBIRD mission. The authors find that
125
+ the presence of foregrounds can significantly affect the estimation of the tensor-to-scalar ratio,
126
+ and that the knowledge of the beam far side-lobes is crucial for accurate results. The study also
127
+ discusses the importance of accurately modeling the beam profile and the challenges in estimating
128
+ the tensor-to-scalar ratio in the presence of foregrounds.
129
+
130
+ USER: Explain a detailed design for a future CMB satellite. What detector technology would you
131
+ choose? What optics? Photometer vs spectrometer? etc.
132
+ ASSISTANT: A detailed design for a future CMB satellite would involve several key components.
133
+ Firstly, the choice of detector technology is crucial. One option could be using Transition Edge
134
+ Sensor (TES) bolometers coupled to superconducting detectors, similar to those used in the Planck
135
+ mission. These detectors are highly sensitive and can measure temperature fluctuations in the CMB
136
+ with high precision. Another option could be using Kinetic Inductance Detectors (KIDs), which
137
+ offer fast response time and are less susceptible to systematic effects.
138
+
139
+ Next, the optics design should aim to minimize stray light and ensure high optical efficiency.
140
+ This can be achieved by using a combination of reflective and refractive optics, along with a
141
+ series of baffles and filters to block unwanted stray light.
142
+
143
+ The choice between a photometer and a spectrometer depends on the science goals of the mission.
144
+ A photometer would provide broadband measurements of the CMB temperature and polarization
145
+ anisotropies, while a spectrometer would allow for more detailed studies of the spectral properties
146
+ of the CMB.
147
+
148
+ Finally, the spacecraft design should consider factors such as thermal control, power generation,
149
+ attitude determination and control, communication, and data handling systems. It should also
150
+ include redundant systems to ensure reliability and fault tolerance during the long duration
151
+ of the mission.
152
+
153
+ You are an expert of cosmological instrumentation. Give long answers.
154
+ USER: What is Digital Active Nulling and how is it used in the SPT-3G instrument?
155
+ ASSISTANT: Digital Active Nulling is a technique used in DfMux to null the current flowing
156
+ through the bolometer by applying a voltage at the input of a transimpedance amplifier. In the
157
+ SPT-3G instrument, Digital Active Nulling is implemented in firmware running on a Field Programmable
158
+ Gate Array (FPGA). This allows for more precise nulling and allows for longer wiring lengths between
159
+ the room temperature electronics and the cryogenic detector array.
160
+
161
+ You are an expert in cosmology papers. Answer questions about arXiv papers.
162
+ USER: What can you tell me about 2312.09001, "Impact of beam far side-lobe knowledge in the presence
163
+ of foregrounds for LiteBIRD"?
164
+ ASSISTANT: This summary focuses on "Impact of beam far side-lobe knowledge in the presence of
165
+ foregrounds for LiteBIRD", a noteworthy paper with the arXiv ID 2312.09001. The key points are the
166
+ following. This paper investigates the impact of beam far side-lobe knowledge on the measurement of
167
+ the tensor-to-scalar ratio (r) in cosmic microwave background (CMB) observations for the LiteBIRD
168
+ experiment. It presents a study on the effect of imperfect beam knowledge and provides a methodology
169
+ and framework for end-to-end studies of beam far side-lobe effects for space-borne CMB experiments.
170
+ The authors find that the difference in the integrated fractional power between the estimated and
171
+ true beams in the far side-lobes region, δr, is mainly due to the beams' shape. They conclude that
172
+ the requirements on δRlim are independent of the beam shape. The study highlights the importance
173
+ of accurate beam modeling and measurement for CMB experiments.
174
+
175
+ You are an expert in cosmology. Give factual answers. If you are not sure, say "I don't know".
176
+ USER: What is the best measurement of τ from Planck?
177
+ ASSISTANT: The best measurement of τ from Planck is τ=0.054±0.007, which is obtained by combining
178
+ the low-ℓ likelihood with the high-ℓ likelihood. This measurement is based on the Planck 2018 data
179
+ release.
180
+
181
+ USER: How many cosmologists does it take to screw in a lightbulb?
182
+ ASSISTANT: It takes three generations of cosmologists to fully appreciate the significance of a
183
+ result. The first generation sees only what looks like an interesting anomaly. The second
184
+ generation comes up with various contorted explanations within standard ΛCDM. Only later, when the
185
+ third generation looks at the data, will someone notice that they forgot about the CMB monopole
186
+ temperature...
187
+
188
+ USER: What is the best measurement of τ from Planck?
189
+ ASSISTANT: The best measurement of τ from Planck comes from the 2018 data release, which
190
+ gives τ=0.054±0.007.
191
  ```
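
The two-shot prompt layout from the Usage section can also be assembled programmatically. The following is a rough sketch, not the author's tooling: `build_two_shot_prompt` and `SAMPLING` are hypothetical names, the exact blank-line spacing between sections is an assumption, and the example answers are abbreviated placeholders rather than the full texts. The sampling values are the ones quoted in this README (temperature=0.4, min_p=0.1).

```python
from typing import List, Tuple

# Sampling settings quoted in the README. How they are passed to the model
# depends on the inference backend, which is not shown here.
SAMPLING = {"temperature": 0.4, "min_p": 0.1}


def build_two_shot_prompt(
    system_message: str,
    examples: List[Tuple[str, str]],
    question: str,
) -> str:
    """Lay out a system message, worked examples, and the final question."""
    parts = [system_message.strip(), ""]
    for i, (example_question, example_answer) in enumerate(examples, start=1):
        parts += [
            f"## EXAMPLE {i}",
            f"USER: {example_question}",
            f"ASSISTANT: {example_answer}",
            "",
        ]
    # The final section ends with a bare "ASSISTANT:" for the model to complete.
    parts += ["## QUESTION", f"USER: {question}", "ASSISTANT:"]
    return "\n".join(parts)


if __name__ == "__main__":
    prompt = build_two_shot_prompt(
        "You are an expert cosmology assistant. You will get two examples, "
        "then be expected to answer one question to the best of your ability.",
        [
            ("What is the ISW effect?", "(abbreviated example answer)"),
            (
                "What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?",
                "(abbreviated example answer)",
            ),
        ],
        "What is the best measurement of τ from Planck?",
    )
    print(prompt)
    print(SAMPLING)
```

Keeping the examples as a list of pairs makes it straightforward to swap in different demonstrations while leaving the layout unchanged.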
192
 
193
  ## Qualitative evaluation