Brandon Rohrer
Data scientist
Numba rule of thumb #7: Pass return variables in as input arguments. This avoids initializing a fresh array each time, shaving off precious microseconds.

It's natural to write a function that looks like this

```python
@njit
def add(a, b):
    c = np.zeros(a.size)
    for i in range(a.size):
        c[i] = a[i] + b[i]
    return c
```

where the result array, c, is created and initialized before it is populated.

Often, functions are called repeatedly with arguments of the same shape. (The fact that they are called so often is what makes them appealing targets for speeding up with Numba.) When that is the case, it's possible to use a shortcut.

```python
@njit
def add(a, b, c):
    for i in range(a.size):
        c[i] = a[i] + b[i]
```

where the result array, c, is created just once, outside the function, and reused. This way the memory space is preallocated and the function can get right to the business at hand.

This is such a useful trick that NumPy uses it too. Most NumPy functions have an optional `out` parameter that you can use to pass a preallocated results array.

The difference is typically just a small fraction of the total compute time, but it's a freebie: an optimization that comes with simpler code and logic. There's no downside! That's a rare thing. Letting it go unclaimed is like leaving the last bite of cheesecake just sitting on the table.
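A minimal sketch of the same preallocation pattern using NumPy's `out` parameter (the arrays here are illustrative):

```python
import numpy as np

a = np.arange(5, dtype=np.float64)
b = np.arange(5, dtype=np.float64)
c = np.empty_like(a)  # result array, allocated once, outside the hot loop

for _ in range(3):       # stand-in for repeated calls with the same shape
    np.add(a, b, out=c)  # writes into c; no fresh allocation per call

print(c)  # [0. 2. 4. 6. 8.]
```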
2 Comments
Chris A.
Software Engineer at PNNL (Center for AI, data.pnnl.gov)
5d
IMO add(a, b) should be the default unless you've profiled the code and know that it is the bottleneck. The downsides are well documented (Google around for why people prefer to write side-effect-free functions and avoid global variables; there is no shortage of people who have written about it).
Yigit Mertol Kayabasi
Data Scientist / ML
4d
🍰

Last week I had the chance to work extensively with three primary methods for identifying circular references in a large dataset of nodes and edges. I approached the task in the following ways:

- SQL: a declarative approach, where I specified the outcome I wanted.
- pandas DataFrame: an imperative approach, giving me control over how the calculation is carried out.
- Linear algebra: using numpy to operate on coordinate-list (COO) and compressed sparse row (CSR) matrices to find a resolution.
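The post doesn't include code, but the sparse-matrix approach it mentions can be sketched roughly like this. The edge list below is made up for illustration; the idea used here is that a node lies on a cycle exactly when some power of the adjacency matrix (up to the number of nodes) has a nonzero entry on that node's diagonal.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical directed edges: 0->1, 1->2, 2->0 form a cycle; 2->3 does not
rows = np.array([0, 1, 2, 2])
cols = np.array([1, 2, 0, 3])
data = np.ones(4)
A = coo_matrix((data, (rows, cols)), shape=(4, 4)).tocsr()  # COO -> CSR

# Accumulate the diagonals of A, A^2, ..., A^n
P = A.copy()
on_cycle = np.zeros(4, dtype=bool)
for _ in range(4):
    on_cycle |= P.diagonal() != 0
    P = P @ A

print(on_cycle)  # nodes 0, 1, 2 are on a cycle; node 3 is not
```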

G Yuvaraj
Data Analyst | Microsoft SQL Server | Microsoft Excel | PowerBI | Python Programming Language | Numpy
Array Functions:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Mean
print("Mean:", np.mean(arr))

# Sum
print("Sum:", np.sum(arr))

# Min and max values
print("minvalue:", np.min(arr))
print("maxvalue:", np.max(arr))

# Median
print("median value is:", np.median(arr))

# Standard deviation
print("std_dev value is:", np.std(arr))

# Variance
print("the variance value is:", np.var(arr))
```

Output:

```
Mean: 3.5
Sum: 21
minvalue: 1
maxvalue: 6
median value is: 3.5
std_dev value is: 1.707825127659933
the variance value is: 2.9166666666666665
```

Harshal Kate
DATA SCIENCE | ANALYST | PYTHON SQL POWER BI | EDA | DATA VISUALIZATION
⏺ Learned about: Analysis of variance, or ANOVA, is a statistical method that separates observed variance in data into different components to use for additional tests. A one-way ANOVA is used for three or more groups of data, to gain information about the relationship between the dependent and independent variables.
▶ Performed the ANOVA test
▶ Libraries used: numpy, scipy.stats
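A minimal one-way ANOVA sketch with numpy and scipy.stats (the three groups below are synthetic, not the author's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=10, scale=2, size=30)
group_b = rng.normal(loc=10, scale=2, size=30)
group_c = rng.normal(loc=13, scale=2, size=30)  # shifted mean

# One-way ANOVA: do the group means differ?
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
# A small p-value (e.g. < 0.05) suggests at least one group mean differs
```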

Kichere Magubu
Aspiring Data Scientist.  Content Creator.
pandas.pivot: Return reshaped DataFrame organized by given index / column values. Copy the code below to follow along:

```python
import pandas as pd

df = pd.DataFrame({
    'sex': ['male', 'male', 'male', 'female', 'female', 'female'],
    'skill': ['sing', 'cook', 'swim', 'sing', 'cook', 'swim'],
    'age': [25, 40, 32, 20, 34, 45],
    'pay': ['120', '100', '150', '170', '130', '200']
})
```
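One possible pivot of that DataFrame (the choice of index/columns/values here is just one illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'sex': ['male', 'male', 'male', 'female', 'female', 'female'],
    'skill': ['sing', 'cook', 'swim', 'sing', 'cook', 'swim'],
    'age': [25, 40, 32, 20, 34, 45],
    'pay': ['120', '100', '150', '170', '130', '200']
})

# One row per sex, one column per skill, pay as the cell values
wide = df.pivot(index='sex', columns='skill', values='pay')
print(wide)
```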

Sai Saran
Computer Science Engineering
HI EVERYONE. TODAY the post is about RESHAPE of an array and the RANDOM function.

First we will see how to reshape an array, with an example:

```python
import numpy as ns  # numpy imported under the alias ns

arr = ns.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
newarr = arr.reshape(2, 5)  # 2 rows, 5 columns
print(newarr)
# [[ 1  2  3  4  5]
#  [ 6  7  8  9 10]]
```

RANDOM function: now an example of generating random values into an array:

```python
h = ns.random.random([3, 2])  # a 3x2 array of random values in [0, 1)
print(h)
# e.g. [[0.55982 0.23175]
#       [0.68978 0.41294]
#       [0.08388 0.56146]]
```

Ian Martin Ajzenszmidt
Information Technology and Services Professional and Political Scientist (Retired  Not Working since 1989).
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fsolve

# Define the equations
def equations(vars):
    x, y = vars
    eq1 = np.exp(x) - y - 2
    eq2 = x**2 + y**2 - 1
    return [eq1, eq2]

# Initial guess
initial_guess = [0, 0]

# Find the solution
solution = fsolve(equations, initial_guess)

# Plotting
x = np.linspace(-3, 3, 400)
y = np.linspace(-3, 3, 400)
X, Y = np.meshgrid(x, y)
F = np.exp(X) - Y - 2
G = X**2 + Y**2 - 1

plt.figure(figsize=(8, 6))
plt.contour(X, Y, F, levels=[0], colors='r')
plt.contour(X, Y, G, levels=[0], colors='b')
plt.plot(solution[0], solution[1], 'ko')  # solution point
plt.xlabel('x')
plt.ylabel('y')
plt.title('Solution of Two Nonlinear Equations')
plt.legend(['f(x, y) = 0', 'g(x, y) = 0', 'Solution'])
plt.grid(True)
plt.show()
```

Emmanuel Nnadi
Justicedbn4real (Data Scientist and Machine Learning Engineer)
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from sklearn.preprocessing import LabelEncoder

### SECTION 5
# Split the data into train and test data
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=210, test_size=0.20)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

X_train.head(5)
X_train.tail(5)
```
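Since X and y are defined earlier in the author's notebook, here is a self-contained sketch of the same split using synthetic stand-ins for them (the data below is made up):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the notebook's X and y
X = pd.DataFrame(np.arange(200).reshape(100, 2), columns=['f1', 'f2'])
y = pd.Series(np.arange(100) % 2)

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=210, test_size=0.20)

print(X_train.shape)  # (80, 2)
print(X_test.shape)   # (20, 2)
```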

Surendra Panpaliya
Corporate Trainer | Consultant | Generative AI, ML, DL | GitHub Copilot | Python, Django | Ruby on Rails | ReactJS | Data Science | Big Data | Hadoop | Scala | Spark | PySpark | Flask | AWS, DataBricks | Go lang, Gin
✳️ Go language cheat sheet ✳️

🔰📚 Basic Syntax:
- Variables: `var a int` or `a := 10`
- Constants: `const Pi = 3.14`
- Functions: `func add(x, y int) int { return x + y }`

🔁 Control Structures:
- If-Else: `if x > 0 { ... } else { ... }`
- Switch: `switch day { case "Monday": ... }`
- Loops: `for i := 0; i < 10; i++ { ... }`

🧩 Data Structures:
- Arrays: `var a [5]int`
- Slices: `s := []int{1, 2, 3}`
- Maps: `m := map[string]int{"foo": 1}`

🔗 Pointers:
- Pointer Declaration: `var p *int`
- Pointer Usage: `p = &i`

📦 Packages:
- Import: `import "fmt"`
- Export: `func CapitalizedFunction() { ... }`

🛠️ Tools:
- Build: `go build`
- Run: `go run`

#golang #gocheatsheet #go

Muhammad Hamza
Computational Science & Engineering Simulations
The df.replace() method is a very powerful tool for cleaning and manipulating DataFrames in Pandas. It is important to understand how to use it effectively, as it can be very useful for preparing data for analysis and modeling.

Here are some additional things to keep in mind when using the df.replace() method:

- The df.replace() method returns a new DataFrame by default. If you want to modify the original DataFrame in place, set the inplace parameter to True.
- It can be used to replace values in a single column, in multiple columns, or across the entire DataFrame. To limit the replacement to particular columns, call replace() on those columns or pass a nested dictionary like {'column': {old: new}}. Calling df.replace() on the DataFrame itself applies the replacement everywhere.
- Values can be replaced by value, by regular expression, or by dictionary. To replace by value, pass the old value and the new value. To replace by regular expression, pass the pattern and the new value along with regex=True. To replace by dictionary, pass a dictionary with the old values as keys and the new values as values.

#replacemethod #pythonprogramming
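A small illustrative sketch of the three replacement styles described above (the DataFrame and values are made up):

```python
import pandas as pd

df = pd.DataFrame({'city': ['NYC', 'LA', 'N.Y.C.'],
                   'score': [1, 2, 1]})

# By value: replace one value everywhere
by_value = df.replace('LA', 'Los Angeles')

# By dictionary, per column: only touch the 'score' column
by_dict = df.replace({'score': {1: 10}})

# By regular expression: normalize spelling variants
by_regex = df.replace(r'^N\.?Y\.?C\.?$', 'New York City', regex=True)

print(by_regex['city'].tolist())  # ['New York City', 'LA', 'New York City']
```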

Maksym Kalaidov 🇺🇦
Head of Applications at Neural Concept | Germany | I mentor aspiring CAE Data Scientists
After using Pandas for 4 years, I can't remember the difference between pd.concat(), pd.merge(), and pd.DataFrame.join().

So I made this diagram to look it up. It's a short and simplified summary of this brilliant StackOverflow thread https://lnkd.in/e_gvTxbk as well as this extensive Pandas documentation page https://lnkd.in/eHqTz2yX. While this diagram covers the most common use cases for me, I highly encourage you to check out the docs, as there are many more details.

Some additional notes:
1. pd.concat() only joins on index, pd.merge() joins on column keys, and pd.DataFrame.join() joins the other DataFrame on index only.
2. pd.concat() is basically an equivalent of np.vstack() and np.hstack().
3. You can use pd.merge() for all these operations; just pay close attention to the arguments.
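A tiny sketch contrasting the three operations (the frames below are made up):

```python
import pandas as pd

left = pd.DataFrame({'key': ['a', 'b'], 'x': [1, 2]})
right = pd.DataFrame({'key': ['b', 'c'], 'y': [3, 4]})

# concat: stacks along an axis, aligning on labels (here rows, union of columns)
stacked = pd.concat([left, right])

# merge: relational join on column keys
merged = pd.merge(left, right, on='key')  # inner join -> only key 'b' survives

# join: joins the other frame on index
joined = left.set_index('key').join(right.set_index('key'))

print(merged)
```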