IndentationError: unexpected indent - Anaconda - The freeCodeCamp Forum

    # Binarize the labels
    # print(class_names)
    # lb = label_binarize(y = y, classes = list(class_names))
    # classes.remove('unknown')
    # lb.fit(y) #for LabelBinarizer not lable_binerize()
    # lb.classes_ #for LabelBinarizer not lable_binerize
    # Split the training data for cross validation
    (X_train, X_test), (y_train, y_test) = train_test_split(X, y, test_size=0.2,
                                                            random_state=0)
    df_y_train = pd.DataFrame(y_train, columns=['label']) #,'Date','group_idx'])
    print('df_y_train.shape', df_y_train.shape,'X_train', X_train.shape)
    ##### Dimensionality Reduction ####

Error message:
File "<ipython-input-50-1c94ab12f530>", line 10
    (X_train, X_test), (y_train, y_test) = train_test_split(X, y, test_size=0.2,
IndentationError: unexpected indent

Hello fngwira.
I have edited your post for readability. In the future, use Markdown to format your posts, by placing any code in between backticks (`).

To answer your question:

Remove the parentheses around your split output variables.

X_train, X_test, y_train, y_test = ...
Hope this helps
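
A small sketch of why the parentheses matter, separate from the indentation problem. `train_test_split` returns one flat list with two entries per input array, so four names are needed on the left-hand side. The nested form would fail at run time with a ValueError, but it would not by itself cause an IndentationError. To keep the snippet runnable without scikit-learn, `fake_split` below is a hypothetical stand-in that only mimics that return shape.

```python
def fake_split(X, y):
    """Hypothetical stand-in for train_test_split: returns a flat 4-item list."""
    cut = int(len(X) * 0.8)
    return [X[:cut], X[cut:], y[:cut], y[cut:]]


X = list(range(10))
y = [value % 2 for value in X]

# Flat unpacking matches the flat 4-item return value.
X_train, X_test, y_train, y_test = fake_split(X, y)

# Nested unpacking expects exactly 2 items at the top level, so it fails.
try:
    (a, b), (c, d) = fake_split(X, y)
    nested_error = None
except ValueError as exc:
    nested_error = str(exc)

print(len(X_train), len(X_test))
print(nested_error)
```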

@Sky020 the error message is still there:

File "", line 10
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
IndentationError: unexpected indent

This is my code:
def ML_with_CV_feat(cv_feat_file='../data/cv_feat.csv', n_comp=100, 
                    plotting=False):
    # Importing the bottleneck features for each image
    feat_df = pd.read_csv(cv_feat_file, index_col=0, dtype='unicode')
    ##-- Dealing with NaN
    feat_df.fillna(0, inplace=True)  
    feat_df['blob_detected'] = feat_df['blob_detected']*1
    #['cell_area', 'cell_eccentricity', 'cell_solidity', 'average_blue', 'average_green', 'average_red', 'blob_detected', 'num_of_blobs', 'average_blob_area']
#    feat_df = feat_df.sample(frac=0.01)
    feat_df.drop(columns=['cell_area', 'cell_eccentricity', 'cell_solidity',
                           'average_blue', 'average_green', 'average_red'],
                 inplace=True)
    #Removing features that do not seperate populations of cell class
    column_names = feat_names = list(feat_df.columns)
    print(column_names)
    for X in ['label','fn']:
        feat_names.remove(x)
#    feat_df = feat_df.iloc[0:300,:]
    mask = feat_df.loc[:, 'label'].isin(['Infected', 'Uninfected'])
    feat_df = feat_df.loc[mask, :].drop_duplicates()
    print('Number of features:', len(feat_names))
    y = feat_df.loc[:,['label']].values
    print(type(y), y.shape)
    print('Number of samples for each label \n', feat_df.groupby('label')['label'].count())
#    print(feat_df.head())
    X = feat_df.loc[:, feat_names].astype(float).values
    print('/nColumn feat names after placing into X',
          list(feat_df.loc[:, feat_names].columns))
class_names = set(feat_df.loc[:,'label'])
    # Binarize the labels
    # print(class_names)
#    lb = label_binarize(y = y, classes = list(class_names))
    # classes.remove('unknown')
    # lb.fit(y) #for LabelBinarizer not lable_binerize()
    # lb.classes_ #for LabelBinarizer not lable_binerize
    # Split the training data for cross validation
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, 
                                                        random_state=0)
    df_y_train = pd.DataFrame(y_train, columns=['label']) #,'Date','group_idx'])
    print('df_y_train.shape', df_y_train.shape,'X_train', X_train.shape)
    ##### Dimensionality Reduction ####

Error message:

File "", line 10
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
IndentationError: unexpected indent

If you want this inside the function ML_with_CV_feat():

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

then indent this line by the same number of spaces as the rest of the function body (four, in your code):

class_names = set(feat_df.loc[:,'label'])

As posted, class_names starts at column 0, which ends the function body. The indented lines after it no longer belong to any block, so Python reports an unexpected indent on the next line of code.
If you do not want the train/test split to be defined inside the function, give it the same indentation as the class_names line instead.
In Python, indentation determines which block each statement belongs to.
Hope this helps.
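
A minimal sketch of that rule, using made-up function bodies rather than the code from this thread. The built-in `compile()` is used only so both variants can be checked without running them; the names `f`, `a`, `b` and `c` are placeholders. Note that the error is raised for the line that is indented again, not for the dedented line that caused the problem.

```python
# A dedented statement ends the function body; the next indented
# statement then has no block to belong to.
broken = (
    "def f():\n"
    "    a = 1\n"
    "b = 2\n"        # back at column 0: the body of f() ends here
    "    c = 3\n"    # indented again with no open block
)

# Same statements, all at the function's indentation level.
fixed = (
    "def f():\n"
    "    a = 1\n"
    "    b = 2\n"
    "    c = 3\n"
)


def check(source):
    """Return 'ok' or the IndentationError message for the given source."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return "ok"


print(check(broken))
print(check(fixed))
```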
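
Use this:
# Imports the function needs (assumed to live in an earlier notebook cell)
import pandas as pd
from sklearn.model_selection import train_test_split

def ML_with_CV_feat(cv_feat_file='../data/cv_feat.csv', n_comp=100, plotting=False):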
    feat_df = pd.read_csv(cv_feat_file, index_col=0, dtype='unicode')
    feat_df.fillna(0, inplace=True)  
    feat_df['blob_detected'] = feat_df['blob_detected']*1
    #['cell_area', 'cell_eccentricity', 'cell_solidity', 'average_blue', 'average_green', 'average_red', 'blob_detected', 'num_of_blobs', 'average_blob_area']
    #feat_df = feat_df.sample(frac=0.01)
    feat_df.drop(columns=['cell_area', 'cell_eccentricity', 'cell_solidity', 'average_blue', 'average_green', 'average_red'], inplace=True)
    column_names = feat_names = list(feat_df.columns)
    print(column_names)
    for name in ['label','fn']: #! RENAMED: THE ORIGINAL LOOPED OVER 'X' BUT REMOVED 'x'
        feat_names.remove(name) #'x' WAS NEVER DEFINED HERE. USE ONE NAME CONSISTENTLY
    #feat_df = feat_df.iloc[0:300,:]
    mask = feat_df.loc[:, 'label'].isin(['Infected', 'Uninfected'])
    feat_df = feat_df.loc[mask, :].drop_duplicates()
    print('Number of features:', len(feat_names))
    y = feat_df.loc[:,['label']].values
    print(type(y), y.shape)
    print('Number of samples for each label \n', feat_df.groupby('label')['label'].count())
    X = feat_df.loc[:, feat_names].astype(float).values
    print('\nColumn feat names after placing into X', list(feat_df.loc[:, feat_names].columns))
    class_names = set(feat_df.loc[:,'label'])
    # print(class_names)
    #lb = label_binarize(y = y, classes = list(class_names))
    # classes.remove('unknown')
    # lb.fit(y) #for LabelBinarizer not lable_binerize()
    # lb.classes_ #for LabelBinarizer not lable_binerize
    # Split the training data for cross validation
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    df_y_train = pd.DataFrame(y_train, columns=['label']) #,'Date','group_idx'])
    print('df_y_train.shape', df_y_train.shape,'X_train', X_train.shape)
    ##### Dimensionality Reduction ####
Try that, and look out for the comments I added in CAPITAL LETTERS.

As the error message indicates, you have an indentation error. This error occurs when a statement is indented where no block is open, or when its indentation does not match that of the earlier statements in the same block. Python not only insists on indentation, it insists on consistent indentation. You are free to choose how many spaces to use per level, but you then need to stick with it. If you indent one line by 4 spaces and the next by 2 (or 5, or 10, or …) within the same block, you'll get this error.
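
To make the two cases concrete, here is a minimal sketch with made-up snippets (not the code from this thread). It compiles each snippet with the built-in `compile()` and reports the message Python attaches to the IndentationError, if any. The helper name `indentation_problem` is just an illustration.

```python
samples = {
    # Indented although no block (if/for/def ...) was opened.
    "too deep": "x = 1\n    y = 2\n",
    # Block opened with 4 spaces, next line uses 2: matches no outer level.
    "inconsistent dedent": "if True:\n    a = 1\n  b = 2\n",
    # Every line in the block uses the same indentation.
    "consistent": "if True:\n    a = 1\n    b = 2\n",
}


def indentation_problem(source):
    """Return the IndentationError message, or None if the source compiles."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:
        return exc.msg
    return None


for label, source in samples.items():
    print(label, "->", indentation_problem(source))
```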
Mixing tabs and spaces is still allowed by default in Python 2, but relying on this "feature" is strongly discouraged. Python 3 rejects indentation that mixes tabs and spaces inconsistently. The recommended approach is to indent with 4 spaces per level and to replace any tabs accordingly.
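
A rough sketch of how this looks in Python 3, again with made-up snippets. One line is indented with a tab and the next with 8 spaces, which may look identical in an editor. The helpers `compile_status` and `lines_with_tab_indent` are hypothetical names for this example, not part of any library; most editors offer a "convert tabs to spaces" command that does the actual cleanup.

```python
# Tab on one line, 8 spaces on the next: visually aligned, but ambiguous.
mixed = "if True:\n\ta = 1\n        b = 2\n"
# Same code indented with 4 spaces throughout.
spaces_only = "if True:\n    a = 1\n    b = 2\n"


def compile_status(source):
    """Return 'ok' or the kind of indentation error Python 3 raises."""
    try:
        compile(source, "<example>", "exec")
    except TabError as exc:  # TabError is a subclass of IndentationError
        return f"TabError: {exc.msg}"
    except IndentationError as exc:
        return f"IndentationError: {exc.msg}"
    return "ok"


def lines_with_tab_indent(source):
    """Return 1-based numbers of lines whose leading whitespace has a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        leading = line[: len(line) - len(line.lstrip())]
        if "\t" in leading:
            hits.append(number)
    return hits


print(compile_status(mixed))
print(lines_with_tab_indent(mixed))
print(compile_status(spaces_only))
```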

Hi @fillermark!
This post has not been active for over a year.
Please only reply to newer topics.
Thanks!