not_IO@lemmy.blahaj.zone to Science Memes@mander.xyz (English) · 3 days ago
how things become science [image] · 122 comments
https://www.nature.com/articles/d41586-026-01100-y https://bsky.brid.gy/r/https://bsky.app/profile/did:plc:s6yp6jam5og3tftozaw7pjth/post/3mj34sn6kyk25
percent@infosec.pub (English) · 2 days ago

There are huge public datasets that are often used for pretraining. Common Crawl and C4 are probably the most prominent, but there are others.

There are also big public datasets available for fine-tuning and instruction tuning.

The open-weight models are getting pretty powerful, thanks to some Chinese labs.
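For context on what separates C4 from raw Common Crawl: C4 was derived from Common Crawl dumps by applying heuristic text-cleaning rules (for example, keeping only lines that end in terminal punctuation and contain a minimum number of words). Here is a toy sketch of that idea in plain Python; the thresholds and rules are simplified illustrations, not the actual C4 pipeline, which applies many more filters.

```python
# Toy sketch of two C4-style cleaning heuristics applied to scraped text:
# 1) keep only lines ending in terminal punctuation,
# 2) drop lines with too few words.
# Illustrative only -- the real C4 pipeline has many additional filters
# (deduplication, bad-word lists, language identification, etc.).

TERMINAL_PUNCTUATION = (".", "!", "?", '"')

def keep_line(line: str, min_words: int = 5) -> bool:
    """Return True if a line looks like natural-language prose."""
    line = line.strip()
    return line.endswith(TERMINAL_PUNCTUATION) and len(line.split()) >= min_words

def clean_page(text: str) -> str:
    """Keep only the prose-like lines of a scraped web page."""
    return "\n".join(line for line in text.splitlines() if keep_line(line))

page = "Click here\nThis is a complete sentence.\nok."
print(clean_page(page))  # only the full sentence survives
```

Navigation fragments like "Click here" fail the punctuation test, and very short lines like "ok." fail the word-count test, which is roughly how this kind of filtering strips boilerplate from crawled pages before pretraining.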