Deep learning

invite653bdf03 · 24/08/2018, 20h27

Hi, I'm a deep learning hobbyist and I'm trying to implement a CNN for handwritten digit recognition (MNIST), but I have a question about the classification stage (the MLP). At the end of training, do I save several sets of weights (that is, 10 sets of MLP parameters, one for each class), or just a single set of weights for all classes? Thanks.

invite6c250b59 · 24/08/2018, 21h41

Both would be possible, but the usual solution is to have a single set of weights for all classes.
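To make "a single set of weights" concrete, here is a minimal sketch of one shared network that scores all 10 classes in a single forward pass. Everything in it is an assumption for illustration: the names (`SharedWeightsSketch`, `layer`, `argmax`, `randomWeights`), the layer sizes, and the sigmoid activation. The output matrix has one row per class, but all rows read the same hidden layer and would be trained together on every example.

```java
import java.util.Random;

/** Hypothetical sketch: one shared network scoring all classes at once. */
final class SharedWeightsSketch {

    static float sigmoid(float x) {
        return 1.0f / (1.0f + (float) Math.exp(-x));
    }

    /** One fully connected layer: out[j] = sigmoid(sum_i w[j][i] * in[i]). */
    static float[] layer(float[][] w, float[] in) {
        float[] out = new float[w.length];
        for (int j = 0; j < w.length; j++) {
            float sum = 0.0f;
            for (int i = 0; i < in.length; i++) {
                sum += w[j][i] * in[i];
            }
            out[j] = sigmoid(sum);
        }
        return out;
    }

    /** Index of the largest score, i.e. the predicted class. */
    static int argmax(float[] scores) {
        int best = 0;
        for (int k = 1; k < scores.length; k++) {
            if (scores[k] > scores[best]) {
                best = k;
            }
        }
        return best;
    }

    /** Random weights in [-0.5, 0.5). */
    static float[][] randomWeights(int rows, int cols, Random rng) {
        float[][] w = new float[rows][cols];
        for (float[] row : w) {
            for (int i = 0; i < cols; i++) {
                row[i] = rng.nextFloat() - 0.5f;
            }
        }
        return w;
    }

    public static void main(String[] args) {
        Random rng = new Random(0);
        int nInput = 784, nHidden = 30, nClasses = 10; // assumed sizes for MNIST
        float[][] w1 = randomWeights(nHidden, nInput, rng);   // shared by all classes
        float[][] w2 = randomWeights(nClasses, nHidden, rng); // one row per class

        float[] image = new float[nInput]; // placeholder input (all zeros)
        float[] scores = layer(w2, layer(w1, image));
        System.out.println("scores: " + scores.length
                + ", predicted class: " + argmax(scores));
    }
}
```

The predicted digit is simply the index of the largest of the 10 outputs, so there is no need to store a separate network per class.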

invite653bdf03 · 24/08/2018, 21h54

Thanks, but my problem is that with a single set of weights for all classes, the weights end up optimal only for the last input (the last digit presented). So is it possible to have 10 sets of weights, one for each class?

invite6c250b59 · 24/08/2018, 22h04

That's not normal. You probably have a major bug lurking somewhere. Can you show your code?

invite653bdf03 · 24/08/2018, 22h11

Yes, here it is.

Code:

package MLP;

import javax.swing.table.DefaultTableModel;

/**
 *
 * @author Givenchy
 */
public class MLP {
    
    public static  boolean arret = false;
    public static int nInputs, nHiddens, nOutputs;  // Number of neurons in each layer
    public static float[] input, hidden, output;

    public static float[][] weightL1,  // Weights values of connection between neuron "j"
                                                //     from hidden layer and "i" from input layer
                                     weigthL2;  // Weights values of connection between neuron "j"
                                                //     from output layer and "i" from hidden layer
    public static float learningRate = 0.5f; // Learning rate
    
    //--------------------------------------------------------------------------
    public static float Epsilon   = 0.00001f;
    public static int    nbIter    = 100;
    //--------------------------------------------------------------------------

    /** 
     * Creates a new instance of MLP.
     *
     * @param nInput number of neurons at input layer
     * @param nHidden number of neurons at hidden layer
     * @param nOutput number of neurons at output layer
     */
    public static void MLP(int nInput, int nHidden, int nOutput, DefaultTableModel pi, DefaultTableModel pf) 
    {

        
        nInputs = nInput;
        nHiddens = nHidden;
        nOutputs = nOutput;

        input = new float[nInput];
        hidden = new float[nHidden];
        output = new float[nOutput];

        weightL1 = new float[nHidden][nInput];
        weigthL2 = new float[nOutput][nHidden];

        
    //    pi.setColumnCount(0);
       
   //     for(int i=0;i<nInput;i++) pi.addColumn("W"+(i+1));
    //    pi.setRowCount(nHidden);
        
    //    pf.setColumnCount(0);
        
    //    for(int i=0;i<nInput;i++) pf.addColumn("W"+(i+1));
    //    pf.setRowCount(nHidden); 
        
        // Initialize weights
        
        generateRandomWeights(pi);
        
        
    }


    /**
     * Set the learning rate for training.
     *
     * @param lr learning rate
     */
    public void setLearningRate(float lr) {
        learningRate = lr;
    }


    /**
     * Initialize weights with random values in the interval [-0.5, 0.5).
     */
    private static void generateRandomWeights(DefaultTableModel W) {

        // Input -> hidden weights
        for (int j = 0; j < nHiddens; j++)
            for (int i = 0; i < nInputs; i++)
            {
                weightL1[j][i] = (float) (Math.random() - 0.5);
                // W.setValueAt(weightL1[j][i], j, i);
                // System.out.println(j+":"+i+"     WI : "+weightL1[j][i]);
            }
        // System.out.println();

        // Hidden -> output weights
        for (int j = 0; j < nOutputs; j++)
            for (int i = 0; i < nHiddens; i++) {
                weigthL2[j][i] = (float) (Math.random() - 0.5);
                // System.out.println(j+"-"+i+"    WJ : "+weigthL2[j][i]);
            }

        /*
        for (int i = 1; i < nHiddens; i++)
            for (int j = 0; j < nInputs; j++)
            {
                System.out.println("" + weightL1[i][j]);
            }
        for (int j = 1; j <= nOutputs; j++)
            for (int i = 0; i <= nHiddens; i++) {
                System.out.println("" + weigthL2[j][i]);
            }
        */
    }


    /**
     * Train the network with a given pattern.
     * The pattern is passed through the network and the weights are adjusted
     * by backpropagation, considering the desired output.
     *
     * @param pattern the pattern to be learned
     * @param desiredOutput the desired output for pattern
     * @return the network output before weights adjusting
     */
    public static float[] train(float[][] pattern, float[] desiredOutput,DefaultTableModel wf) 
    {
        
        int it=0;
        
        while((it<nbIter))//&&(arret==false))
        {  
                /*float[]*/ output = passNet(pattern);
                
                
                rcm.Fmlp.ZERO.setText(""+output[0]);
                rcm.Fmlp.UN.setText(""+output[1]);
                rcm.Fmlp.DEUX.setText(""+output[2]);
                rcm.Fmlp.TROIS.setText(""+output[3]);
                rcm.Fmlp.QUATRE.setText(""+output[4]);
                rcm.Fmlp.CINQ.setText(""+output[5]);
                rcm.Fmlp.SIX.setText(""+output[6]);
                rcm.Fmlp.SEPT.setText(""+output[7]);
                rcm.Fmlp.HUIT.setText(""+output[8]);
                rcm.Fmlp.NEUF.setText(""+output[9]);                
                System.out.println("SORTIE N° 0 :"+output[0]+"     SORTIE N° 1 :"+output[1]+"     SORTIE N° 2 :"+output[2]+"     SORTIE N° 3 :"+output[3]+"     SORTIE N° 4 :"+output[4]+"     SORTIE N° 5 :"+output[5]+"     SORTIE N° 6 :"+output[6]+"     SORTIE N° 7 :"+output[7]+"     SORTIE N° 8 :"+output[8]+"     SORTIE N° 9 :"+output[9]);
         
                
                backpropagation(desiredOutput,wf);
                

                
                
        it++;
        }
        System.out.println("it =================="+it);
        //-----------------------------------------
     //    System.out.println("SORTIE-FIN N° 0 :"+output[0]+"     SORTIE-FIN N° 1 :"+output[1]);
        
        //-----------------------------------------
        return output;
    }


    /**
     * Passes a pattern through the network. Activation functions are logistic.
     *
     * @param pattern pattern to be passed through the network
     * @return the network output for this pattern
     */
    public static float[] passNet(float[][] pattern) {

        for(int i=0; i<nInputs; i++) {
            input[i] = pattern[i][1];
        }
        
        // Set bias
        input[0] = (float) 1.0;
        hidden[0] = (float) 1.0;

        // Passing through hidden layer
        for(int j=0; j<nHiddens; j++) 
        {
            hidden[j] = (float) 0.0;
            for(int i=0; i<nInputs; i++) 
            {
                hidden[j] += weightL1[j][i] * input[i];
            }
            hidden[j] = /*Math.max(0, hidden[j]);*/  1.f/(1.f+(float)Math.exp(-hidden[j]));
        }
    
        // Passing through output layer
        for(int j=0; j<nOutputs; j++) 
        {
            output[j] = (float) 0.0;
            for(int i=0; i<nHiddens; i++) 
            {
                output[j] += weigthL2[j][i] * hidden[i];
       	    }
            output[j] = /*Math.max(0, output[j]);*/ 1.f/(1.f+(float)Math.exp(-output[j]));
        }
//------------------------------------------------------------------------------
      System.out.println();
      for(int k=0;k<output.length;k++)
      System.out.println(k+" : sortie ===> "+output[k]);
      System.out.println();
//------------------------------------------------------------------------------
        return output;
    }


    /**
     * This method adjusts weights using error backpropagation. The desired
     * output is compared with the last network output and weights are adjusted
     * using the chosen learning rate.
     *
     * @param desiredOutput desired output for the last given pattern
     * @param wf table model for displaying weights (currently unused)
     */
    private static void backpropagation(float[] desiredOutput, DefaultTableModel wf)
    {
        float[] errorL2 = new float[nOutputs];
        float[] errorL1 = new float[nHiddens];
        float Esum = 0.0f;

        // Layer 2 error gradient
        for (int i = 0; i < nOutputs; i++)
            errorL2[i] = (float) (output[i] * (1.0 - output[i]) * (desiredOutput[i] - output[i]));

        // Layer 1 error gradient
        for (int i = 0; i < nHiddens; i++)
        {
            for (int j = 0; j < nOutputs; j++)
                Esum += weigthL2[j][i] * errorL2[j];

            errorL1[i] = (float) (hidden[i] * (1.0 - hidden[i]) * Esum);
            Esum = 0.0f;
        }

        //---------------------------------------------------------------------
        // Stopping criterion: half the sum of squared layer 1 error terms
        Esum = 0.0f;
        for (int e = 0; e < errorL1.length; e++)
            Esum = Esum + errorL1[e] * errorL1[e];
        Esum = (float) (0.5 * Esum);
        System.out.println("Esum ============================== " + Esum + "            Epsilon : " + Epsilon);

        if (Esum <= Epsilon)
        {
            arret = true;

            for (int j = 0; j < nHiddens; j++)
                for (int i = 0; i < nInputs; i++)
                {
                    weightL1[j][i] += learningRate * errorL1[j] * input[i];
                    // wf.setValueAt(weightL1[j][i], j, i);
                }
        }

        //---------------------------------------------------------------------
        if (arret == false)
        {
            for (int j = 0; j < nOutputs; j++)
                for (int i = 0; i < nHiddens; i++)
                    weigthL2[j][i] += learningRate * errorL2[j] * hidden[i];

            for (int j = 0; j < nHiddens; j++)
                for (int i = 0; i < nInputs; i++)
                {
                    weightL1[j][i] += learningRate * errorL1[j] * input[i];
                    // wf.setValueAt(weightL1[j][i], j, i);
                }
        }
    }
    
    
}

invite6c250b59 · 25/08/2018, 14h26

Now that's a good old network fresh out of the 90s... nostalgia!

It's the direct ancestor of deep learning, but it isn't deep learning. For it to qualify you would need more layers, but if you add more layers training will stall, because logistic functions yield a gradient that shrinks exponentially with the number of layers. To get around that, you will need to add one or all of the typical deep learning tricks: weight sharing (convnet), dropout, ReLU activations, batchnorm, etc.
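The exponential shrinkage can be illustrated with a small sketch. The derivative of the logistic function is at most 0.25, and backpropagation multiplies one such factor per layer, so in the best case (and ignoring the weights, which is a simplifying assumption) the gradient is scaled by 0.25 raised to the number of layers. The class and method names below are hypothetical and chosen only for illustration.

```java
/** Hypothetical sketch: how the sigmoid's derivative shrinks a backpropagated gradient. */
final class VanishingGradientSketch {

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Derivative of the logistic function; its maximum is 0.25, reached at x = 0. */
    static double sigmoidDerivative(double x) {
        double s = sigmoid(x);
        return s * (1.0 - s);
    }

    /** Derivative of ReLU: 1 for positive inputs, 0 otherwise. */
    static double reluDerivative(double x) {
        return x > 0.0 ? 1.0 : 0.0;
    }

    /**
     * Best-case product of activation derivatives across the given number of
     * sigmoid layers (weights ignored): 0.25 raised to the number of layers.
     */
    static double bestCaseSigmoidFactor(int layers) {
        return Math.pow(sigmoidDerivative(0.0), layers);
    }

    public static void main(String[] args) {
        for (int layers : new int[] {1, 2, 5, 10}) {
            System.out.println(layers + " sigmoid layer(s): factor <= "
                    + bestCaseSigmoidFactor(layers));
        }
    }
}
```

By contrast, `reluDerivative` returns exactly 1 for any active unit, which is one reason ReLU is among the tricks listed above.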

Why not use more modern tools, such as Theano/TensorFlow/PyTorch? You are going to spend a lot of time reinventing the wheel. A few leads for your bug: 1) the learning rate seems far too high. Could you have forgotten to multiply it by epsilon? 2) are the inputs well conditioned? 3) is the presentation order random? Also, I suggest finishing with a softmax so that the final activation converges to the probability of membership in each of your 10 classes, and having tools to monitor convergence as a function of the number of iterations.
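Three of these suggestions can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not a fix for the code posted above: the names (`TrainingHintsSketch`, `softmax`, `scalePixels`, `shuffledOrder`) are hypothetical, and scaling 0..255 pixel values into [0, 1] is just one common way to condition MNIST inputs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helpers illustrating softmax, input scaling and random presentation. */
final class TrainingHintsSketch {

    /** Numerically stable softmax: subtract the maximum before exponentiating. */
    static float[] softmax(float[] z) {
        float max = z[0];
        for (float v : z) {
            max = Math.max(max, v);
        }
        float sum = 0.0f;
        float[] p = new float[z.length];
        for (int k = 0; k < z.length; k++) {
            p[k] = (float) Math.exp(z[k] - max);
            sum += p[k];
        }
        for (int k = 0; k < z.length; k++) {
            p[k] /= sum; // probabilities now sum to 1
        }
        return p;
    }

    /** Assumed conditioning step: scale raw 0..255 pixel values into [0, 1]. */
    static float[] scalePixels(int[] pixels) {
        float[] scaled = new float[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            scaled[i] = pixels[i] / 255.0f;
        }
        return scaled;
    }

    /** Indices 0..n-1 in a fresh random order; meant to be called once per epoch. */
    static List<Integer> shuffledOrder(int n, Random rng) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        Collections.shuffle(order, rng);
        return order;
    }

    public static void main(String[] args) {
        float[] probs = softmax(new float[] {2.0f, 1.0f, 0.1f});
        System.out.println("softmax: " + Arrays.toString(probs));

        Random rng = new Random(42);
        for (int epoch = 0; epoch < 2; epoch++) {
            // Each example is visited once per epoch, in a different order each time,
            // instead of repeating one example many times before moving to the next.
            System.out.println("epoch " + epoch + " order: " + shuffledOrder(5, rng));
        }
    }
}
```

Visiting the examples in a new random order every epoch avoids having the most recently presented digit dominate the weights.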
