R Programming
- R Weekly 2024-W33 Snakes & Ladders, Tables, Deutschland Tour (rweekly.org)
Weekly News in the R Community
- R Weekly 2024-W29, webrcli & spidyr, compare slider widget (rweekly.org: R Weekly 2024-W28, webrcli & spidyr, compare slider widget)
Weekly News in the R Community
- R Weekly 2024-W10 patching R, data.table survey, Doom plots (rweekly.org)
Weekly News in the R Community
- datasauRus: Datasets from the Datasaurus Dozen (cran.r-project.org)
The Datasaurus Dozen is a set of datasets with the same summary statistics. They retain the same summary statistics despite having radically different distributions. The datasets represent a larger and quirkier object lesson that is typically taught via Anscombe's Quartet (available in the 'datasets...
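The "same summary statistics, different distributions" idea can be seen without installing anything, using Anscombe's Quartet, which the description above mentions and which ships with base R as the `anscombe` data frame. This is only a sketch of the principle; it does not use the datasauRus package itself, and the `summary_stats` name is made up for illustration.

```r
# Anscombe's Quartet is the `anscombe` data frame in base R's datasets
# package, with columns x1..x4 and y1..y4 (one x/y pair per dataset).
sets <- 1:4

summary_stats <- data.frame(
  set    = sets,
  mean_x = sapply(sets, function(i) mean(anscombe[[paste0("x", i)]])),
  mean_y = sapply(sets, function(i) mean(anscombe[[paste0("y", i)]])),
  cor_xy = sapply(sets, function(i) {
    cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
  })
)

# The four rows are nearly identical, although scatter plots of the
# four sets look completely different.
print(round(summary_stats, 2))
```

Plotting each pair, for example with `plot(anscombe$x1, anscombe$y1)`, shows how different the four sets are despite the matching numbers.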
- Python Rgonomics | Emily Riederer (emilyriederer.netlify.app)
Switching languages is about switching mindsets - not just syntax. New developments in python data science toolings, like polars and seaborn's object interface, can capture the "feel" that converts from R/tidyverse love while opening the door to truly pythonic workflows
cross-posted from: https://programming.dev/post/8257343
> Emily Riederer writes:
>
> > Switching languages is about switching mindsets - not just syntax. New developments in python data science toolings, like polars and seaborn's object interface, can capture the "feel" that converts from R/tidyverse love while opening the door to truly pythonic workflows
>
> > Just to be clear:
> >
> > - This is not a post about why python is better than R so R users should switch all their work to python
> > - This is not a post about why R is better than python so R semantics and conventions should be forced into python
> > - This is not a post about why python users are better than R users so R users need coddling
> > - This is not a post about why R users are better than python users and have superior tastes for their toolkit
> > - This is not a post about why these python tools are the only good tools and others are bad tools
>
> > # The Stack
> >
> > With that preamble out of the way, below are a few recommendations for the most ergonomic tools for getting set up, conducting core data analysis, and communicating results.
> >
> > To preview these recommendations:
> >
> > ### Set Up
> >
> > - Installation: pyenv
> > - IDE: VS Code
> >
> > ### Analysis
> >
> > - Wrangling: polars
> > - Visualization: seaborn
> >
> > ### Communication
> >
> > - Tables: Great Tables
> > - Notebooks: Quarto
> >
> > ### Miscellaneous
> >
> > - Environment Management: pdm
> > - Code Quality: ruff
>
> Read Python Rgonomics
- Conda is moving to Mastodon & LinkedIn | conda.org/blog (conda.org)
Conda is retiring its Twitter account. Please join us on Mastodon and LinkedIn
cross-posted from: https://discuss.online/post/4110869
> Conda (@conda@fosstodon.org) writes:
>
> > Conda is moving our social media presence from Twitter/X to Mastodon and LinkedIn at the start of 2024. It's past time to move into spaces that are welcoming and more in line with our community values. Going forward, you can find us at
> >
> > - @conda@fosstodon.org (https://fosstodon.org/@conda)
> > - Conda Community on LinkedIn
>
> Read Conda is moving to Mastodon & LinkedIn | conda.org/blog
>
> # Conda (Software)
>
> Conda provides package, dependency, and environment management for any language.
>
> Using conda provides a streamlined approach to package management, platform compatibility, environment isolation, and access to an extensive package ecosystem. It is particularly beneficial for data scientists, researchers, and developers working with diverse software requirements across different projects.
>
> # Conda Community
>
> The "conda" community is made up of millions of users, packaging maintainers and tool developers. Conda is not a single organization but rather a concerted effort of many different organizations, all devoted to the mission of providing easy access to various types of free software regardless of the operating system or programming language.
>
> We firmly believe that everyone belongs in open-source, and we want to start by thanking you for taking the time to read this page. What follows is a high level summary of all the projects and organizations which make up the conda community with links provided where you can learn more or get involved yourself.
>
> The many meanings of "conda"
>
> Traditionally associated with the Anaconda distribution, nowadays the term "conda" refers to more than just a package manager or a software repository. Its many definitions also encompass community packaging efforts like conda-forge and bioconda, as well as new tools developed in the Mamba and conda-incubator organizations. All these efforts show that the conda ecosystem is no longer defined by a single actor and continues to grow and thrive.
>
> Organizations on GitHub include:
>
> - @conda, plus Anaconda, Inc. efforts like @AnacondaRecipes, @anaconda-distribution, @ContinuumIO
> - @conda-forge, @regro
> - @conda-incubator & @conda-tools
> - @mamba-org
> - @bioconda
>
> Some tools you might be familiar with are conda or conda-build themselves but also community efforts like mamba, boa, setup-miniconda, conda-lock or conda-tree, among many more.
>
> Read more about the conda community.
- Python equivalent R code, some nuance concerning RStudio
2023-12-29 by Novica Nakov:
> I notice that `Ctrl+Enter` for running the code in `Python` and in `R` is not the same thing.

Read the whole article
- Free R manual in German (www.produnis.de: Statistik mit R und RStudio)
This book is intended as a reference work for statistical questions concerning the free software R.
If you are looking for a free R manual in German, check out my book. The PDF version has about 450 pages.
- [The Art of] Regression and other stories
cross-posted from: https://programming.dev/post/7032449
> This will likely be the last (for some time) of my posts about learning resources for Statistical methods and underlying theories for Data Science and Machine Learning foundations.
>
> Regression and Other Stories by Andrew Gelman, Jennifer Hill, and Aki Vehtari, including the (R) code and data for the examples.
>
> The author of the linked review seems generally positive about the text, though they noted some concerns.
>
> I'm least likely to use this as my primary resource going forward, in part due to an enthusiastic recommendation for Statistical Rethinking. But it looks like a promising supplemental resource to bridge that gap between theory and application.
- Statistical Rethinking - A Bayesian Course with Examples in R and Stan (and PyMC3, brms, and Julia too) (github.com: GitHub - rmcelreath/stat_rethinking_2024)
Contribute to rmcelreath/stat_rethinking_2024 development by creating an account on GitHub.
cross-posted from: https://programming.dev/post/6990204
> Richard McElreath has made his course materials available on GitHub.
>
> However, the course follows the 2nd edition of McElreath's book Statistical Rethinking, which is not available in a free digital format.
>
> After watching the first lecture in the Statistical Rethinking 2023 YouTube Playlist, I might go ahead and purchase the text and use this course instead of Trevor Hastie and Rob Tibshirani's An Introduction to Statistical Learning (with Applications in R or Python) course.
>
> I also like that this resource has made an explicit attempt to provide code examples in Julia as well as the more popular Python and R.
>
> I wasn't sure who Richard McElreath was, so I did a quick search, which revealed his position as Director of the Department of Human Behavior, Ecology and Culture at the Max Planck Institute for Evolutionary Anthropology in Leipzig.
- An Introduction to Statistical Learning (with Applications in R or Python)
cross-posted from: https://programming.dev/post/6942085
> ### Book (Free)
>
> The resource on statistical methods recommended to me the most has been An Introduction to Statistical Learning (with Applications in R or Python) by Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani, and Jonathan Taylor. It's free to download and has been kept up to date. (The latest edition is from 2022.)
>
> ### Online Course (Free with optional payment for "Verified Track")
>
> For those who prefer a structured online course, StanfordOnline: Statistical Learning with R by Trevor Hastie and Robert Tibshirani uses An Introduction to Statistical Learning (with Applications in R) as the course textbook.
>
> ### More In-Depth Book
>
> Individuals with advanced training in the mathematical sciences may wish to use The Elements of Statistical Learning (Data Mining, Inference, and Prediction) by Trevor Hastie, Robert Tibshirani, and Jerome Friedman, which provides a more comprehensive and detailed treatment of a wider range of topics in statistical learning.
- Packages built with Rcpp dependency failing R-devel checks at CRAN (github.com: A fresh -Wformat-security issue under r-devel · Issue #1287 · RcppCore/Rcpp)
Update 2023-11-28: If you came here because of a similar message in your package please read on and see particularly this comment below for the fairly simple fix. While working on an update for RQu...
If you, like me, maintain any CRAN package that depends on Rcpp, be aware that checks under R-devel are failing. The fix is simple (it is in the linked bug report), but, at least in my case, it must be done before 12/12 or the packages will be removed from CRAN.
- Paper introducing softbib, a package that automatically generates a bibliography of all packages used (www.cambridge.org: Software Citations in Political Science | PS: Political Science & Politics | Cambridge Core)
Software Citations in Political Science - Volume 56 Issue 3
- Z-Test in R: A Tutorial on One Sample & Two Sample Z Tests (www.marsja.se)
Z test in R: Learn one-sample & two-sample Z-tests, hypothesis testing, practical examples (Base R & BSDA).
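As a rough illustration of the one-sample case, the sketch below computes a z statistic and a two-sided p-value in base R. It is not the tutorial's code and does not use BSDA; the helper name `z_test_one_sample` and the example numbers are made up, and it assumes the population standard deviation `sigma` is known.

```r
# Hypothetical helper: one-sample, two-sided z-test with known sigma.
z_test_one_sample <- function(x, mu0, sigma) {
  n <- length(x)
  # Standardised distance between the sample mean and the null value mu0
  z <- (mean(x) - mu0) / (sigma / sqrt(n))
  # Two-sided p-value from the standard normal distribution
  p <- 2 * stats::pnorm(-abs(z))
  list(statistic = z, p.value = p)
}

# Made-up example: 50 draws tested against a null mean of 100
set.seed(1)
sample_x <- rnorm(50, mean = 100, sd = 15)
result <- z_test_one_sample(sample_x, mu0 = 100, sigma = 15)
print(result)
```

When `sigma` is not known and has to be estimated from the sample, a t-test (`t.test()` in base R) is usually the more appropriate choice.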
- Does subsetting (matrices or arrays) always perform a partial copy?
Some large datasets are pushing memory and some functions I'm writing to the limit. I wanted to ask some questions about subsetting, of matrices and arrays in particular:
  - Does defining a variable as a subset of another lead to a copy? For instance

        x <- matrix(rnorm(20*30), nrow=20, ncol=30)
        y <- x[, 1:10]

    Some exploration with `object_size` from `pryr` seems to indicate that a copy is made when `y` is created, but I'd like to be sure.
  - If I enter a subset of a matrix/array as an argument to a function, does it get copied before the function is started? For instance in

        x <- matrix(rnorm(20*30), nrow=20, ncol=30)
        y <- dnorm(0, mean=x[,1:10], sd=1)

    I wonder if the data in `x[,1:10]` are copied and then given as input to `dnorm`.

I've heard that `data.table` allows one to work with subsets without copies being made (unless necessary), but it seems that one is constrained to two dimensions only (no arrays) that way.

Cheers!
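One way to probe the first question with base R only is sketched below. It gives indirect evidence rather than a look at R's internals: subsetting an ordinary matrix with `[` produces a new object with its own storage, so `y` has its own size and changing it leaves `x` alone. The variable names other than `x` and `y` are made up for illustration.

```r
x <- matrix(rnorm(20 * 30), nrow = 20, ncol = 30)
y <- x[, 1:10]  # `[` builds a new 20 x 10 matrix from the selected columns

# y has its own storage, roughly a third of the size of x
size_x <- as.numeric(utils::object.size(x))
size_y <- as.numeric(utils::object.size(y))

# Modifying y does not touch x, so the two cannot share data
x_before <- x[1, 1]
y[1, 1] <- y[1, 1] + 1
x_unchanged <- identical(x[1, 1], x_before)

print(c(size_x = size_x, size_y = size_y))
print(x_unchanged)
```

The same reasoning applies to the second question: an argument such as `x[, 1:10]` is an expression that has to be evaluated into a temporary subset before `dnorm` can use its values. By contrast, a plain assignment such as `z <- x` does not copy until one of the two objects is modified.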
- Are there any generalized data communities on Lemmy?
I used to frequent r/analytics and r/datascience, which are broader than instances dedicated to a single technology or language.
- print to console tables that can be easily copied and pasted to Excel
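There is a function `format_csv` in package `readr`, which returns CSV-formatted output as a string. It can be used as
```
# Print a data frame to the console as CSV text
to_console <- function(dta) {
  cat(readr::format_csv(dta))
}

# someoperation() stands in for any data-manipulation step
dta |> someoperation() |> to_console()
```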
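A possible variation, sketched here with base R only: printing tab-separated text instead of CSV, on the assumption that tab-delimited text pasted into Excel is split into columns without a separate import step. The helper name `to_console_tsv` and the example data frame are made up for illustration.

```r
# Hypothetical helper: print a data frame to the console as tab-separated text.
to_console_tsv <- function(dta) {
  # file = "" sends the output to the console
  utils::write.table(dta, file = "", sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(dta)  # return the input invisibly so the helper can end a pipeline
}

example_dta <- data.frame(id = 1:2, name = c("a", "b"))
to_console_tsv(example_dta)
```

With `quote = FALSE`, values that themselves contain tabs or line breaks would break the layout, so this suits simple tables best.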